I have a postgres 8.4.5 instance on CentOS 5 (x86_64) which appears to go crazy with the amount of memory it consumes. When I run the query below, in a matter of a few seconds memory balloons to 5.3G (virtual), 4.6G (resident) and 1840 (shared), and eventually the oom killer is invoked, killing the entire process. Physical memory is 8GB but other processes on the box consume approximately 4GB of that. The settings changed from their defaults: effective_cache_size = 4GB work_mem = 16MB maintenance_work_mem = 128MB wal_buffers = 16MB checkpoint_segments = 16 shared_buffers = 384MB checkpoint_segments = 64 and default_statistics_target = 100 The query is this: insert into d_2010_09_13_sum select FOO.i, FOO.n, sum(FOO.cc) as cc, sum(FOO.oc) as oc from ( select * from d_2010_09_12_sum union all select * from d_2010_09_13 ) AS FOO group by i, n; here is the explain: Subquery Scan "*SELECT*" (cost=1200132.06..1201332.06 rows=40000 width=80) -> HashAggregate (cost=1200132.06..1200732.06 rows=40000 width=41) -> Append (cost=0.00..786531.53 rows=41360053 width=41) -> Seq Scan on d_2010_09_12_sum (cost=0.00..520066.48 rows=27272648 width=42) -> Seq Scan on d_2010_09_13 (cost=0.00..266465.05 rows=14087405 width=40) Both source tables freshly vacuum analyze'd. The row estimates are correct for both source tables. If I use "set enable_hashagg = false" I get this plan: Subquery Scan "*SELECT*" (cost=8563632.73..9081838.25 rows=40000 width=80) -> GroupAggregate (cost=8563632.73..9081238.25 rows=40000 width=41) -> Sort (cost=8563632.73..8667033.84 rows=41360441 width=41) Sort Key: d_2010_09_12_sum.i, d_2010_09_12_sum.n -> Result (cost=0.00..786535.41 rows=41360441 width=41) -> Append (cost=0.00..786535.41 rows=41360441 width=41) -> Seq Scan on d_2010_09_12_sum (cost=0.00..520062.04 rows=27272204 width=42) -> Seq Scan on d_2010_09_13 (cost=0.00..266473.37 rows=14088237 width=40) and postmaster's memory never exceeds (roughly) 548M (virtual), 27M (resident), 5M (shared). I even set default_statistics_target to 1000 and re-ran "vacuum analyze verbose" on both tables - no change. If I set work_mem to 1MB (from 16MB) then the GroupAggregate variation is chosen instead. Experimentally, HashAggregate is chosen when work_mem is 16MB, 8MB, 6MB, 5MB but not 4MB and on down. Two things I don't understand: 1. Why, when hash aggregation is allowed, does memory absolutely explode (eventually invoking the wrath of the oom killer). 16MB for work_mem does not seem outrageously high. For that matter, neither does 5MB. 2. Why do both HashAggregate and GroupAggregate say the cost estimate is 40000 rows? -- Jon -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance