Search Postgresql Archives

Re: Bad planning data resulting in OOM killing of postgres

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I wrote:
> David Hinkle <hinkle@xxxxxxxxxxxxxx> writes:
>> Thanks guys, here's the information you requested:
>> psql:postgres@cipafilter = show work_mem;
>> work_mem
>> ──────────
>> 10MB
>> (1 row)

> [ squint... ]  It should absolutely not have tried to hash a 500M-row
> table if it thought work_mem was only 10MB.  I wonder if there's an
> integer-overflow problem or something like that.

Ah, nah, scratch that: I was thinking of the hash aggregation case,
where there's no run-time ability to spill to disk so the planner
will not risk using hash aggregation if it estimates the hash table
would exceed work_mem.  Hash joins do have the ability to restrict
memory consumption by increasing the number of batches, so the planner
doesn't worry about it in that case.

I think what must be happening is that there's some one value of
log_raw.titleid that occurs a ridiculous number of times, so that
the executor is unable to split up that particular hash bucket,
leading to OOM when it tries to load all those rows to process
the hash bucket.

The planner does attempt to estimate the worst-case bucket size,
but in what now seems like brain fade, it doesn't do more with that
information than charge an appropriate number of tuple comparisons.
That would somewhat discourage use of a hash join, but a merge join
on this many tuples would be pretty expensive too, so it's not overly
surprising that it went with hashing anyway.

I am thinking we ought to fix it so it rejects (or at least heavily
penalizes) a hash join if it estimates that the rows containing
the inner side's most common value wouldn't all fit in work_mem.

What's not completely certain however is whether this diagnosis is
accurate for your case, or whether the proposed remedy would fix it.
It would be useful to see the contents of pg_stats.most_common_freqs
for log_raw.titleid.

			regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux