Best Practices for Managing PostgreSQL Statistics
Introduction
Updating Statistics
Last Stats Update
default_statistics_target
Viewing Statistics
Introduction
In PostgreSQL, statistics play a critical role in optimizing query performance. They provide essential information to the query planner, aiding in efficient query execution by estimating the number of rows and choosing optimal query plans.
Updating Statistics
PostgreSQL provides the ANALYZE command to update statistics:
Analyze all tables in the current database:
ANALYZE VERBOSE;
Analyze a specific table:
ANALYZE VERBOSE table_name;
Analyze specific columns:
ANALYZE VERBOSE table_name (column_name1, column_name2);
VERBOSE enables the display of progress messages during the analyze operation.
Last Stats Update
The following SQL shows the last time each table has been analyzed, manually or automatically, the number of modifications (inserts, updates, deletes) since the last analyze. Lastly, it shows the number of live and dead rows, along with the percentage of dead rows calculated.
SELECT
relname,
last_analyze,
last_autoanalyze,
analyze_count,
autoanalyze_count,
n_mod_since_analyze,
n_live_tup,
n_dead_tup,
CASE
WHEN n_live_tup > 0 THEN ROUND(n_dead_tup * 1.0 / n_live_tup * 100, 2)
ELSE 0
END AS pct_dead_tup
FROM
pg_stat_user_tables
ORDER BY
relname;
default_statistics_target
Indicates how many values are stored in the list of most common values and specifies the number of rows to be inspected by ANALYZE
.
Raising the limit may allow for more accurate query planner estimates, particularly for columns with irregular data distributions. This increases the space used to store the statistics and slightly extends the computation time.
The default value is 100, with a maximum of 10,000.
The number of rows to be analyzed is determined by:
300 * default_statistics_target
To explicitly set statistics target value for a column, run the following SQL.
ALTER TABLE table_name ALTER COLUMN column_name SET STATISTICS 1000;
To view column settings -
select
c.relname,
a.attname,
a.attstattarget,
a.attoptions
from
pg_class c
inner join
pg_attribute a
on c.oid = a.attrelid
where
c.relname = 'table_name'
order by
c.relname,
a.attname;
Viewing Statistics
The view pg_stats
provides access to the information stored in the pg_statistic
catalog.
To view statistics for a single table
SELECT * FROM pg_stats WHERE tablename = 'table_name';
To view statistics for all tables in a schema:
SELECT * FROM pg_stats WHERE schemaname = 'public' ORDER BY tablename, attname;
Conclusion
Managing PostgreSQL statistics is crucial for optimizing database performance. By understanding how to update, interpret, and utilize these statistics, you can improve query planning and overall system efficiency.
Related content