In keeping with some of the recent threads regarding the planner... I have a fair sized data warehouse in which I am trying to perform an aggregation, but getting OOM errors in Postgres(8.2.4). I believe the reason for the OOM is that Postgres is attempting to do a hash aggregation, but it has grossly underestimated the rows resulting from the aggregation. The data in the database is very uniformly distributed so I don't believe that the table stats are the cause of the problem. This may be related to table inheritance, and can be demonstrated pretty easily. CREATE TABLE foo(a INT); ANALYZE foo; CREATE TABLE foo_1() INHERITS(foo); insert into foo_1 select generate_series(1,100000); insert into foo_1 select generate_series(1,100000); insert into foo_1 select generate_series(1,100000); ANALYZE foo_1; CREATE TABLE foo_2() INHERITS(foo); insert into foo_2 select generate_series(1,100000); insert into foo_2 select generate_series(1,100000); insert into foo_2 select generate_series(1,100000); ANALYZE foo_2; -- If I query a particular partition, the plan estimate for the hash aggregate is good EXPLAIN ANALYZE SELECT a,COUNT(*) from foo_1 group by a; HashAggregate (cost=5822.00..7061.01 rows=99121 width=4) (actual time=554.556..657.121 rows=100000 loops=1) -> Seq Scan on foo_1 (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.014..203.290 rows=300000 loops=1) Total runtime: 712.211 ms -- If I query the base table, the plan estimate for the hash aggregate is off by several orders of magnitude EXPLAIN ANALYZE SELECT a,COUNT(*) from foo group by a; HashAggregate (cost=11686.10..11688.60 rows=200 width=4) (actual time=1724.188..1826.630 rows=100000 loops=1) -> Append (cost=0.00..8675.40 rows=602140 width=4) (actual time=0.016..1045.134 rows=600000 loops=1) -> Seq Scan on foo (cost=0.00..31.40 rows=2140 width=4) (actual time=0.001..0.001 rows=0 loops=1) -> Seq Scan on foo_1 foo (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.012..205.130 rows=300000 loops=1) -> Seq Scan on foo_2 foo (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.011..203.542 rows=300000 loops=1) Total runtime: 1879.550 ms (6 rows) -- Is there something magical about the hash aggregate estimate of 200 rows? -- I can have 30,000 or 300,000 rows in each child partition table and multiple partition's with different values of "a" and yet it always come up with 200. -- eg. create table foo_3() inherits(foo); insert into foo_3 select generate_series(100000,300000); analyze foo_3; EXPLAIN ANALYZE SELECT a,COUNT(*) from foo group by a; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=15568.12..15570.62 rows=200 width=4) (actual time=2386.813..2691.254 rows=300000 loops=1) -> Append (cost=0.00..11557.41 rows=802141 width=4) (actual time=0.016..1403.121 rows=800001 loops=1) -> Seq Scan on foo (cost=0.00..31.40 rows=2140 width=4) (actual time=0.002..0.002 rows=0 loops=1) -> Seq Scan on foo_1 foo (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.013..201.549 rows=300000 loops=1) -> Seq Scan on foo_2 foo (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.010..211.332 rows=300000 loops=1) -> Seq Scan on foo_3 foo (cost=0.00..2882.01 rows=200001 width=4) (actual time=0.011..137.262 rows=200001 loops=1) Total runtime: 2851.990 ms Is this a bug, or some subtlety of the Postgres query planner? In my particular case, I am doing a join out to another small table as part of the aggregation and using constraint exclusion on the partitions, but I believe the cause of my problem is the same. I am running on 64bit FreeBSD/Postgres 8.2.4 on a machine with 8GB of memory. An explain of the query and resulting OOM diagnostics follow: The aggregation will result in 5,000,000 rows, not 5,000. In the stats_dtl table there are 12 observations(hourly) for each customer There are 125 different policy_id and 25 different policy_group_id's Policy's and policy_groups are even distributed across all customers userquery Scan table_4760 (cost=2897243.60..2897418.60 rows=5000 width=152) -> HashAggregate (cost=2897243.60..2897368.60 rows=5000 width=40) -> Hash Join (cost=7.81..2241002.00 rows=37499520 width=40) Hash Cond: (public.customer_stats_dtl.policy_id = policy.policy_id) -> Append (cost=0.00..1641001.87 rows=59999232 width=40) -> Seq Scan on customer_stats_dtl (cost=0.00..22.45 rows=4 width=40) Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone)) -> Seq Scan on customer_stats_dtl_027 customer_stats_dtl (cost=0.00..1640979.42 rows=59999228 width=40) Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone)) -> Hash (cost=6.25..6.25 rows=125 width=8) -> Seq Scan on policy (cost=0.00..6.25 rows=125 width=8) TopMemoryContext: 268400 total in 32 blocks; 20912 free (43 chunks); 247488 used unnamed prepared statement: 2097152 total in 8 blocks; 721456 free (2 chunks); 1375696 used TopTransactionContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544 used SPI Plan: 3072 total in 2 blocks; 1320 free (0 chunks); 1752 used SPI Plan: 7168 total in 3 blocks; 1312 free (0 chunks); 5856 used SPI Plan: 3072 total in 2 blocks; 1320 free (0 chunks); 1752 used SPI Plan: 3072 total in 2 blocks; 928 free (0 chunks); 2144 used SPI Plan: 7168 total in 3 blocks; 3920 free (0 chunks); 3248 used SPI Plan: 3072 total in 2 blocks; 1224 free (0 chunks); 1848 used PL/PgSQL function context: 24576 total in 2 blocks; 12008 free (9 chunks); 12568 used CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used Rendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used PLpgSQL function cache: 24224 total in 2 blocks; 3744 free (0 chunks); 20480 used Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used S_2: 1024 total in 1 blocks; 488 free (0 chunks); 536 used S_1: 1024 total in 1 blocks; 488 free (0 chunks); 536 used Prepared Queries: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used Record information cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used MessageContext: 8192 total in 1 blocks; 7744 free (0 chunks); 448 used Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used smgr relation table: 57344 total in 3 blocks; 21872 free (8 chunks); 35472 used TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used PortalMemory: 8192 total in 1 blocks; 7888 free (1 chunks); 304 used PortalHeapMemory: 1024 total in 1 blocks; 768 free (0 chunks); 256 used ExecutorState: 122880 total in 4 blocks; 68728 free (2 chunks); 54152 used HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used HashBatchContext: 32888 total in 3 blocks; 14512 free (0 chunks); 18376 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used AggContext: 1033887744 total in 133 blocks; 688 free (0 chunks); 1033887056 used ** TupleHashTable: 566485040 total in 78 blocks; 2064896 free (304 chunks); 564420144 used ** ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 8192 total in 1 blocks; 8040 free (1 chunks); 152 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used Relcache by OID: 57344 total in 3 blocks; 31024 free (6 chunks); 26320 used CacheMemoryContext: 4679720 total in 23 blocks; 404104 free (1 chunks); 4275616 used If you have actually read this far, the following wrinkle is where I am currently stuck. I set ENABLE_HASHAGG=OFF and still get an OOM :-( In this situation Postgres runs for much longer before getting the OOM and I believe it has actually written some records to the aggregation I base this on seeing autovacuum kick off against the aggregation table after the query fails, which may not be valid. userquery Scan table_4760 (cost=8771233.77..9521474.17 rows=5000 width=152) -> GroupAggregate (cost=8771233.77..9521349.17 rows=5000 width=40) -> Sort (cost=8771233.77..8864982.57 rows=37499520 width=40) Sort Key: public.customer_stats_dtl.user_id, policy.policy_group_id -> Hash Join (cost=7.81..2241002.00 rows=37499520 width=40) Hash Cond: (public.customer_stats_dtl.policy_id = policy.policy_id) -> Append (cost=0.00..1641001.87 rows=59999232 width=40) -> Seq Scan on customer_stats_dtl (cost=0.00..22.45 rows=4 width=40) Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone)) -> Seq Scan on customer_stats_dtl_027 customer_stats_dtl (cost=0.00..1640979.42 rows=59999228 width=40) Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone)) -> Hash (cost=6.25..6.25 rows=125 width=8) -> Seq Scan on policy (cost=0.00..6.25 rows=125 width=8) TopMemoryContext: 874608 total in 106 blocks; 14240 free (45 chunks); 860368 used unnamed prepared statement: 2097152 total in 8 blocks; 720544 free (2 chunks); 1376608 used TopTransactionContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544 used AfterTriggerEvents: 758112256 total in 102 blocks; 1792 free (8 chunks); 758110464 used S_2: 1024 total in 1 blocks; 488 free (0 chunks); 536 used Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used SPI Plan: 3072 total in 2 blocks; 1320 free (0 chunks); 1752 used SPI Plan: 7168 total in 3 blocks; 1312 free (0 chunks); 5856 used SPI Plan: 3072 total in 2 blocks; 1320 free (0 chunks); 1752 used SPI Plan: 3072 total in 2 blocks; 928 free (0 chunks); 2144 used SPI Plan: 7168 total in 3 blocks; 3920 free (0 chunks); 3248 used SPI Plan: 3072 total in 2 blocks; 1224 free (0 chunks); 1848 used PL/PgSQL function context: 24576 total in 2 blocks; 12008 free (9 chunks); 12568 used CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used Rendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used PLpgSQL function cache: 24224 total in 2 blocks; 3744 free (0 chunks); 20480 used S_1: 1024 total in 1 blocks; 488 free (0 chunks); 536 used Prepared Queries: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used Record information cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used MessageContext: 8192 total in 1 blocks; 7744 free (0 chunks); 448 used Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used smgr relation table: 253952 total in 5 blocks; 40912 free (16 chunks); 213040 used TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used PortalMemory: 8192 total in 1 blocks; 7888 free (1 chunks); 304 used PortalHeapMemory: 1024 total in 1 blocks; 768 free (0 chunks); 256 used ExecutorState: 122880 total in 4 blocks; 61720 free (15 chunks); 61160 used ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used HashBatchContext: 32888 total in 3 blocks; 14512 free (0 chunks); 18376 used TupleSort: 822437960 total in 105 blocks; 424201336 free (5267294 chunks); 398236624 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used AggContext: 8192 total in 1 blocks; 8000 free (3 chunks); 192 used ExprContext: 8192 total in 1 blocks; 8000 free (0 chunks); 192 used ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used ExprContext: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used Relcache by OID: 253952 total in 5 blocks; 117488 free (8 chunks); 136464 used CacheMemoryContext: 17262632 total in 25 blocks; 3573056 free (0 chunks); 13689576 used Explore the seven wonders of the world Learn more! |