postmaster consuming /lots/ of memory with hash aggregate. why?

Jon Nelson <jnelson+pgsql@xxxxxxxxxxx> · Fri, 5 Nov 2010 19:26:48 -0500

I have a postgres 8.4.5 instance on CentOS 5 (x86_64) which appears to
go crazy with the amount of memory it consumes.
When I run the query below, in a matter of a few seconds memory
balloons to 5.3G (virtual), 4.6G (resident) and 1840 (shared), and
eventually the oom killer is invoked, killing the entire process.

Physical memory is 8GB but other processes on the box consume
approximately 4GB of that.

The settings changed from their defaults:

effective_cache_size = 4GB
work_mem = 16MB
maintenance_work_mem = 128MB
wal_buffers = 16MB
checkpoint_segments = 16
shared_buffers = 384MB
checkpoint_segments = 64

and

default_statistics_target = 100

The query is this:

insert into d_2010_09_13_sum
        select FOO.i, FOO.n, sum(FOO.cc) as cc, sum(FOO.oc) as oc
        from (
          select * from d_2010_09_12_sum
          union all
          select * from d_2010_09_13
        ) AS FOO group by i, n;

here is the explain:

 Subquery Scan "*SELECT*"  (cost=1200132.06..1201332.06 rows=40000 width=80)
   ->  HashAggregate  (cost=1200132.06..1200732.06 rows=40000 width=41)
         ->  Append  (cost=0.00..786531.53 rows=41360053 width=41)
               ->  Seq Scan on d_2010_09_12_sum  (cost=0.00..520066.48
rows=27272648 width=42)
               ->  Seq Scan on d_2010_09_13  (cost=0.00..266465.05
rows=14087405 width=40)

Both source tables freshly vacuum analyze'd.
The row estimates are correct for both source tables.

If I use "set enable_hashagg = false" I get this plan:

 Subquery Scan "*SELECT*"  (cost=8563632.73..9081838.25 rows=40000 width=80)
   ->  GroupAggregate  (cost=8563632.73..9081238.25 rows=40000 width=41)
         ->  Sort  (cost=8563632.73..8667033.84 rows=41360441 width=41)
               Sort Key: d_2010_09_12_sum.i, d_2010_09_12_sum.n
               ->  Result  (cost=0.00..786535.41 rows=41360441 width=41)
                     ->  Append  (cost=0.00..786535.41 rows=41360441 width=41)
                           ->  Seq Scan on d_2010_09_12_sum
(cost=0.00..520062.04 rows=27272204 width=42)
                           ->  Seq Scan on d_2010_09_13
(cost=0.00..266473.37 rows=14088237 width=40)

and postmaster's memory never exceeds (roughly) 548M (virtual), 27M
(resident), 5M (shared).

I even set default_statistics_target to 1000 and re-ran "vacuum
analyze verbose" on both tables - no change.
If I set work_mem to 1MB (from 16MB) then the GroupAggregate variation
is chosen instead.
Experimentally, HashAggregate is chosen when work_mem is 16MB, 8MB,
6MB, 5MB but not 4MB and on down.

Two things I don't understand:

1. Why, when hash aggregation is allowed, does memory absolutely
explode (eventually invoking the wrath of the oom killer). 16MB for
work_mem does not seem outrageously high. For that matter, neither
does 5MB.

2. Why do both HashAggregate and GroupAggregate say the cost estimate
is 40000 rows?

-- 
Jon

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance