On Sun, Apr 21, 2019 at 10:36:43AM -0400, Tom Lane wrote:
> Jeff Janes <jeff.janes@xxxxxxxxx> writes:
> > The growEnabled stuff only prevents infinite loops.  It doesn't prevent
> > extreme silliness.
>
> > If a single 32 bit hash value has enough tuples by itself to not fit in
> > work_mem, then it will keep splitting until that value is in a batch by
> > itself before shutting off
>
> I suspect, however, that we might be better off just taking the existence
> of the I/O buffers into account somehow while deciding whether it's worth
> growing further.  That is, I'm imagining adding a second independent
> reason for shutting off growEnabled, along the lines of "increasing
> nbatch any further will require an unreasonable amount of buffer memory".
> The question then becomes how to define "unreasonable".

On Sun, Apr 21, 2019 at 06:15:25PM +0200, Tomas Vondra wrote:
> I think the question the code needs to be asking is "If we double the
> number of batches, does the amount of memory we need drop?" And the
> memory needs to account both for the buffers and per-batch data.
>
> I don't think we can just stop increasing the number of batches when the
> memory for BufFile exceeds work_mem, because that entirely ignores the
> fact that by doing that we force the system to keep the per-batch stuff
> in memory (and that can be almost arbitrary amount).
...
> Of course, this just stops enforcing work_mem at some point, but it at
> least attempts to minimize the amount of memory used.

This patch defines "reasonable" as: additional BatchFiles will not
themselves exceed work_mem; OR, work_mem has already been exceeded, but
additional BatchFiles are going to save us RAM.

I think the first condition is not very sensitive and not too important to
get exactly right: it only allows work_mem to be exceeded by 2x, which may
already happen for multiple reasons, related to this thread and otherwise.
It'd be fine to slap a factor of /2 or /4 or /8 on it, too.
The current patch doesn't unset growEnabled, since there seems to be no
point at which the hashtable should be allowed to grow without bound: if
the hashtable is *already* more than 2x larger than work_mem, nbatch
should be doubled.

Justin
From 7f303c3635adabe2d75f5bd3cb6a6b2ab5102012 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@xxxxxxxxxxxxx>
Date: Sun, 21 Apr 2019 05:04:09 -0500
Subject: [PATCH v1] account for size of BatchFile structure in hashJoin

Avoid a runaway number of batches, the overhead of which was previously
not counted.  The size of the BatchFiles is now also included in the
instrumentation shown by EXPLAIN ANALYZE.

Analyzed-By: Tom Lane, Tomas Vondra, Justin Pryzby
Discussion: bc138e9f-c89e-9147-5395-61d51a757b3b@xxxxxxxx
---
 src/backend/executor/nodeHash.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index ee93030..9a87265 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -1670,10 +1670,33 @@ ExecHashTableInsert(HashJoinTable hashtable,
 		hashtable->spaceUsed += hashTupleSize;
 		if (hashtable->spaceUsed > hashtable->spacePeak)
 			hashtable->spacePeak = hashtable->spaceUsed;
+
+		/*
+		 * If the addition of 2x as many batch files would cause the
+		 * structure itself (without the hashtable) to exceed spaceAllowed,
+		 * then do not grow ... unless spaceUsed has already exceeded
+		 * spaceAllowed by more than 2x.  Note that spaceUsed doesn't
+		 * include the overhead of the BatchFile structure.
+		 */
 		if (hashtable->spaceUsed +
 			hashtable->nbuckets_optimal * sizeof(HashJoinTuple)
-			> hashtable->spaceAllowed)
-			ExecHashIncreaseNumBatches(hashtable);
+			> hashtable->spaceAllowed)
+		{
+			if (hashtable->nbatch * sizeof(PGAlignedBlock) < hashtable->spaceAllowed)
+			{
+				/* The BatchFile structures themselves still fit in work_mem */
+				ExecHashIncreaseNumBatches(hashtable);
+			}
+			else if (hashtable->spaceUsed / 2 >= hashtable->spaceAllowed)
+			{
+				/*
+				 * Exceeded spaceAllowed by 2x, so we'll save RAM by allowing
+				 * nbatch to increase.  (This branch is hit in almost the
+				 * same cases as the one below.)
+				 */
+				ExecHashIncreaseNumBatches(hashtable);
+			}
+			else if (hashtable->nbatch * sizeof(PGAlignedBlock) < hashtable->spaceUsed)
+			{
+				/*
+				 * spaceUsed is so large that letting the BatchFile
+				 * structures themselves exceed work_mem still saves RAM:
+				 * increasing nbatch and exceeding work_mem uses less space
+				 * than not increasing.
+				 */
+				elog(WARNING, "exceeding work mem: used:%zu allowed:%zu add:%zu",
+					 hashtable->spaceUsed, hashtable->spaceAllowed,
+					 hashtable->nbatch * sizeof(PGAlignedBlock));
+				ExecHashIncreaseNumBatches(hashtable);
+			}
+		}
 	}
 	else
 	{
@@ -2693,6 +2716,8 @@ ExecHashGetInstrumentation(HashInstrumentation *instrument,
 	instrument->nbatch = hashtable->nbatch;
 	instrument->nbatch_original = hashtable->nbatch_original;
 	instrument->space_peak = hashtable->spacePeak;
+	/* Finally, account for the size of the BatchFile structures: */
+	instrument->space_peak += hashtable->nbatch * sizeof(PGAlignedBlock);
 }
 
 /*
-- 
2.7.4