Carmen Mardiros <bluecorr@xxxxxxxxx> writes:
> I've then tried different settings for work_mem, not changing anything else.
> work_mem = 400MB -> query runs fine but memory usage in the system doesn't
> exceed 1.3%
> work_mem = 500MB -> usage hits 100% and postgres crashes out of memory.

I suspect what may be happening is that when you push work_mem to >= 500MB, the planner decides it can replace the GroupAgg step with a HashAgg, which tries to form all the aggregate results at once in memory. Because of the drastic underestimate of the number of groups (2.7 million estimated vs. 27 million actual), the hash table is much bigger than the planner is expecting, causing memory consumption to bloat way beyond what it should be.

You could confirm this idea by seeing whether the EXPLAIN output changes that way depending on work_mem. (Use plain EXPLAIN, not EXPLAIN ANALYZE, so you don't actually run out of memory while experimenting.) If it's true, you might be able to improve the group-count estimate by increasing the statistics target for ANALYZE.

However, the group-count underestimate only seems to be a factor of 10, so you'd still expect the memory usage to be no more than 5GB if the planner were getting it right otherwise. So there may be something else wrong, maybe a plain old memory leak. Can you generate a self-contained example that causes similar memory overconsumption? I'm guessing the behavior isn't very sensitive to the exact data you're using as long as the group counts are similar, so maybe you could post a script that generates junk test data that causes this, rather than needing 27M rows of real data.

			regards, tom lane
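
[Editor's note: a minimal sketch of the EXPLAIN experiment and statistics-target change described above. The table and column names (events, visitor_id) are placeholders, not taken from the original thread.]

    -- Below the suspected threshold: expect a GroupAggregate (fed by a Sort).
    SET work_mem = '400MB';
    EXPLAIN
    SELECT visitor_id, count(*)
    FROM events
    GROUP BY visitor_id;

    -- Above the threshold: if the theory holds, the plan flips to HashAggregate.
    SET work_mem = '500MB';
    EXPLAIN
    SELECT visitor_id, count(*)
    FROM events
    GROUP BY visitor_id;

    -- If the group-count estimate is badly off, raising the statistics target
    -- on the grouping column and re-ANALYZEing may improve it.
    ALTER TABLE events ALTER COLUMN visitor_id SET STATISTICS 1000;
    ANALYZE events;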
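[Editor's note: a rough sketch of the kind of self-contained reproducer requested above, generating junk data with a group count on the order of the real workload. All names, row counts, and the query shape are illustrative assumptions.]

    -- Roughly 27M rows with on the order of 27M distinct grouping values.
    CREATE TABLE agg_test AS
    SELECT
        (random() * 27000000)::int AS visitor_id,   -- high-cardinality group key
        md5(g::text)               AS payload
    FROM generate_series(1, 27000000) AS g;

    ANALYZE agg_test;

    SET work_mem = '500MB';
    EXPLAIN                        -- check the plan with plain EXPLAIN first
    SELECT visitor_id, count(*), min(payload), max(payload)
    FROM agg_test
    GROUP BY visitor_id;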