On Thu, Jul 22, 2021 at 01:32:51PM +0200, Kamil Frydel wrote: > table_1 and table_2 are hash partitioned using volume_id column. Usually we > make analyze on partitions. We do not make analyze on the partitioned table > (parent). > However, if we run 'analyze' on the partitioned table then planner starts > choosing hash join. As a comparison, the execution using nested loop takes > about 15 minutes and if it is done using hash join then the query lasts for > about 1 minute. When running 'analyze' for the partitioned table, postgres > inserts statistics for the partitioned table into pg_stats (pg_statistics). > Before that, there are only statistics for partitions. We suspect that this > is the reason for selecting bad query plan. > updated, the inheritance statistics will not be up to date unless you run > ANALYZE manually. > (https://www.postgresql.org/docs/13/sql-analyze.html) > > I would appreciate if anyone could shed some light on the following > questions: > 1) Is this above paragraph from docs still valid in PG 13 and does it apply > to declarative partitioning as well? Is running analyze manually on a > partitioned table needed to get proper plans for queries on partitioned > tables? Partitioned table (in the declarative way) is ”virtual” and does not > keep any data so it seems that there are no statistics that can be gathered > from the table itself and statistics from partitions should be sufficient. Up through v13, autoanalyze doesn't collect stats on parent tables (neither declarative nor inheritence). I agree that this doesn't seem to be well documented. I think it should also be mentioned here: https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-STATISTICS In v14 (which is currently in beta), autoanalyze will process the partitioned table automatically: https://www.postgresql.org/docs/14/release-14.html |Autovacuum now analyzes partitioned tables (Yuzuko Hosoya, Álvaro Herrera) |Insert, update, and delete tuple counts from partitions are now propagated to their parent tables so autovacuum knows when to process them. > 2) Why does the planner need these statistics since they seem to be unused > in the query plan. The query plan uses only partitions, not the partitioned > table. The "inherited" stats are used when you SELECT FROM table. The stats for the individual table would be needed when you SELECT FROM ONLY table (which makes no sense for a partitioned table). -- Justin