I wrote:
> David Hinkle <hinkle@xxxxxxxxxxxxxx> writes:
>> Thanks guys, here's the information you requested:
>> psql:postgres@cipafilter = show work_mem;
>>  work_mem
>> ──────────
>>  10MB
>> (1 row)

> [ squint... ]  It should absolutely not have tried to hash a 500M-row
> table if it thought work_mem was only 10MB.  I wonder if there's an
> integer-overflow problem or something like that.

Ah, nah, scratch that: I was thinking of the hash aggregation case, where
there's no run-time ability to spill to disk so the planner will not risk
using hash aggregation if it estimates the hash table would exceed
work_mem.  Hash joins do have the ability to restrict memory consumption
by increasing the number of batches, so the planner doesn't worry about it
in that case.

I think what must be happening is that there's some one value of
log_raw.titleid that occurs a ridiculous number of times, so that the
executor is unable to split up that particular hash bucket, leading to OOM
when it tries to load all those rows to process the hash bucket.

The planner does attempt to estimate the worst-case bucket size, but in
what now seems like brain fade, it doesn't do more with that information
than charge an appropriate number of tuple comparisons.  That would
somewhat discourage use of a hash join, but a merge join on this many
tuples would be pretty expensive too, so it's not overly surprising that
it went with hashing anyway.

I am thinking we ought to fix it so it rejects (or at least heavily
penalizes) a hash join if it estimates that the rows containing the inner
side's most common value wouldn't all fit in work_mem.

What's not completely certain, however, is whether this diagnosis is
accurate for your case, or whether the proposed remedy would fix it.  It
would be useful to see the contents of pg_stats.most_common_freqs for
log_raw.titleid.

			regards, tom lane
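
[ For anyone following along, a query along these lines pulls out the
statistics being asked about.  It is only a sketch: it assumes log_raw
lives in the public schema, and the arithmetic at the end (top MCV
frequency times pg_class.reltuples) is just an illustrative estimate of
how many rows carry the most common titleid, i.e. roughly how many rows
the executor would have to hold in memory for that one unsplittable hash
bucket. ]

    -- Sketch only: the schema name 'public' is an assumption; adjust as needed.
    SELECT null_frac,
           n_distinct,
           most_common_vals,
           most_common_freqs,
           most_common_freqs[1]                      -- frequency of the top MCV
             * (SELECT reltuples FROM pg_class
                WHERE oid = 'log_raw'::regclass)     -- planner's row-count estimate
             AS est_rows_with_top_value
    FROM   pg_stats
    WHERE  schemaname = 'public'
      AND  tablename  = 'log_raw'
      AND  attname    = 'titleid';

[ If est_rows_with_top_value comes out in the hundreds of millions, that
would fit the picture of a single hash bucket far exceeding work_mem. ]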