Re: Out of Memory errors are frustrating as heck!

On Sun, Apr 21, 2019 at 10:36:43AM -0400, Tom Lane wrote:
> Jeff Janes <jeff.janes@xxxxxxxxx> writes:
> > The growEnabled stuff only prevents infinite loops.  It doesn't prevent
> > extreme silliness.
> 
> > If a single 32 bit hash value has enough tuples by itself to not fit in
> > work_mem, then it will keep splitting until that value is in a batch by
> > itself before shutting off
> 
> I suspect, however, that we might be better off just taking the existence
> of the I/O buffers into account somehow while deciding whether it's worth
> growing further.  That is, I'm imagining adding a second independent
> reason for shutting off growEnabled, along the lines of "increasing
> nbatch any further will require an unreasonable amount of buffer memory".
> The question then becomes how to define "unreasonable".

On Sun, Apr 21, 2019 at 06:15:25PM +0200, Tomas Vondra wrote:
> I think the question the code needs to be asking is "If we double the
> number of batches, does the amount of memory we need drop?" And the
> memory needs to account both for the buffers and per-batch data.
> 
> I don't think we can just stop increasing the number of batches when the
> memory for BufFile exceeds work_mem, because that entirely ignores the
> fact that by doing that we force the system to keep the per-batch stuff
> in memory (and that can be almost arbitrary amount).
...
> Of course, this just stops enforcing work_mem at some point, but it at
> least attempts to minimize the amount of memory used.

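Roughly, the trade-off Tomas describes looks like this (a back-of-the-envelope sketch, not PostgreSQL code; the function names and the one-buffer-per-batch-per-side assumption are mine):

#include <stdbool.h>
#include <stddef.h>

#define BLCKSZ 8192				/* one BufFile buffer, i.e. sizeof(PGAlignedBlock) */

/* Estimated memory: per-batch file buffers plus the inner data one batch must hold. */
static size_t
estimated_hashjoin_memory(size_t inner_rel_bytes, int nbatch)
{
	size_t		buffile_bytes = (size_t) nbatch * 2 * BLCKSZ;	/* inner + outer files */
	size_t		per_batch_bytes = inner_rel_bytes / nbatch;

	return buffile_bytes + per_batch_bytes;
}

/* Doubling nbatch helps only while the per-batch data still dominates the buffers. */
static bool
doubling_nbatch_saves_memory(size_t inner_rel_bytes, int nbatch)
{
	return estimated_hashjoin_memory(inner_rel_bytes, nbatch * 2)
		< estimated_hashjoin_memory(inner_rel_bytes, nbatch);
}
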
This patch defines "reasonable" as: additional BatchFiles will not themselves
exceed work_mem; OR, work_mem has already been exceeded, but additional BatchFiles
are going to save us RAM...

I think the first condition is not very sensitive and not too important to get
right: it only allows work_mem to be exceeded by 2x, which may already happen for
multiple reasons, both related to this thread and otherwise.  It'd be fine to slap
a factor of 1/2, 1/4, or 1/8 on it, too.
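
For scale, assuming the default work_mem of 4MB and sizeof(PGAlignedBlock) == BLCKSZ == 8192, the first condition stops allowing growth around:

	nbatch * sizeof(PGAlignedBlock) < work_mem
	nbatch * 8192                   < 4 * 1024 * 1024
	nbatch                          < 512

so by itself it can add at most about one extra work_mem worth of file buffers on
top of the hashtable, which is the ~2x overshoot mentioned above.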

The current patch doesn't unset growEnabled, since there's no point at which the
hash table should be allowed to grow without bound: once the hash table *already*
exceeds work_mem by 2x, nbatch should be doubled.

Justin
From 7f303c3635adabe2d75f5bd3cb6a6b2ab5102012 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@xxxxxxxxxxxxx>
Date: Sun, 21 Apr 2019 05:04:09 -0500
Subject: [PATCH v1] account for size of BatchFile structure in hashJoin

Avoid a runaway number of batches, whose overhead was previously not counted.
The size of the batch files is now included in the instrumentation reported by
EXPLAIN ANALYZE.

Analyzed-By: Tom Lane, Tomas Vondra, Justin Pryzby
Discussion: bc138e9f-c89e-9147-5395-61d51a757b3b@xxxxxxxx
---
 src/backend/executor/nodeHash.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index ee93030..9a87265 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -1670,10 +1670,33 @@ ExecHashTableInsert(HashJoinTable hashtable,
 		hashtable->spaceUsed += hashTupleSize;
 		if (hashtable->spaceUsed > hashtable->spacePeak)
 			hashtable->spacePeak = hashtable->spaceUsed;
+
+		/*
+		 * If doubling the number of batch files would cause their buffers alone
+		 * (excluding the hashtable) to exceed spaceAllowed, then do not grow --
+		 * unless spaceUsed has already exceeded spaceAllowed by more than 2x.
+		 * Note: spaceUsed does not include the overhead of the BufFile buffers.
+		 */
 		if (hashtable->spaceUsed +
 			hashtable->nbuckets_optimal * sizeof(HashJoinTuple)
-			> hashtable->spaceAllowed)
-			ExecHashIncreaseNumBatches(hashtable);
+			// + hashtable->nbatch * sizeof(PGAlignedBlock)
+			> hashtable->spaceAllowed) {
+			if (hashtable->nbatch*sizeof(PGAlignedBlock) < hashtable->spaceAllowed) {
+				ExecHashIncreaseNumBatches(hashtable);
+			} else if (hashtable->spaceUsed/2 >= hashtable->spaceAllowed) {
+				/* Exceeded spaceAllowed by 2x, so we'll save RAM by allowing nbatches to increase */
+				/* I think this branch will be hit in nearly the same cases as the one below */
+				ExecHashIncreaseNumBatches(hashtable);
+			} else if (hashtable->nbatch*sizeof(PGAlignedBlock) < hashtable->spaceUsed) {
+				/*
+				 * spaceUsed is so large that letting the batch file buffers themselves exceed work_mem still saves RAM:
+				 * increasing nbatch and exceeding work_mem uses less space than not increasing it.
+				 */
+				elog(WARNING, "exceeding work mem: used:%ld allowed:%ld add:%ld",
+						hashtable->spaceUsed, hashtable->spaceAllowed, hashtable->nbatch*sizeof(PGAlignedBlock));
+				ExecHashIncreaseNumBatches(hashtable);
+			}
+		}
 	}
 	else
 	{
@@ -2693,6 +2716,8 @@ ExecHashGetInstrumentation(HashInstrumentation *instrument,
 	instrument->nbatch = hashtable->nbatch;
 	instrument->nbatch_original = hashtable->nbatch_original;
 	instrument->space_peak = hashtable->spacePeak;
+	/* Finally, account for size of structure: */
+	instrument->space_peak += hashtable->nbatch * sizeof(PGAlignedBlock);
 }
 
 /*
-- 
2.7.4

