On Sun, Apr 21, 2019 at 10:36:43AM -0400, Tom Lane wrote:
> Jeff Janes <jeff.janes@xxxxxxxxx> writes:
>> The growEnabled stuff only prevents infinite loops. It doesn't prevent
>> extreme silliness.
>>
>> If a single 32 bit hash value has enough tuples by itself to not fit in
>> work_mem, then it will keep splitting until that value is in a batch by
>> itself before shutting off.
>
> Right, that's the code's intention. If that's not good enough for this
> case, we'll need to understand the details a bit better before we can
> design a better(?) heuristic.

I think we only disable growing when there are no other values in the
batch, but that seems rather easy to defeat - all you need is a single
tuple with a different hash value that happens to fall into the same
batch, and it's over. Maybe we should relax this check a bit - say, if a
split frees less than 5% of the memory, don't add more batches.
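
For what it's worth, a rough standalone sketch of that relaxed check
(the function and parameter names are made up for illustration, and the
5% cutoff is just the arbitrary number from above):

    #include <stdbool.h>
    #include <stddef.h>

    /*
     * Sketch only, not the actual nodeHash.c logic: decide whether a
     * batch split freed enough memory to be worth trying again later.
     * space_before is the hash table's memory usage before the split,
     * space_after its usage afterwards (assumed not to be larger).
     */
    static bool
    split_freed_enough(size_t space_before, size_t space_after)
    {
        size_t  freed = space_before - space_after;

        /* require at least ~5% of the memory to have been freed */
        return freed >= space_before / 20;
    }
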
> I suspect, however, that we might be better off just taking the existence
> of the I/O buffers into account somehow while deciding whether it's worth
> growing further. That is, I'm imagining adding a second independent
> reason for shutting off growEnabled, along the lines of "increasing
> nbatch any further will require an unreasonable amount of buffer memory".
> The question then becomes how to define "unreasonable".

I think the question the code needs to be asking is "If we double the
number of batches, does the amount of memory we need drop?" And the
memory estimate needs to account for both the BufFile buffers and the
per-batch data.
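
As a minimal standalone sketch of that question (names made up; this
assumes a reasonably uniform hash distribution, so that doubling nbatch
roughly halves the per-batch data while doubling the BufFile buffer
memory):

    #include <stdbool.h>
    #include <stddef.h>

    /*
     * Sketch only: would doubling the number of batches reduce the
     * total memory footprint?  per_batch_data is the in-memory data of
     * the current batch, buffile_mem the memory used by the per-batch
     * BufFile buffers.
     */
    static bool
    doubling_nbatch_helps(size_t per_batch_data, size_t buffile_mem)
    {
        size_t  current = per_batch_data + buffile_mem;
        size_t  doubled = per_batch_data / 2 + 2 * buffile_mem;

        /* equivalent to per_batch_data > 2 * buffile_mem */
        return doubled < current;
    }
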
I don't think we can simply stop increasing the number of batches once
the memory for the BufFiles exceeds work_mem, because that entirely
ignores the fact that by doing so we force the system to keep the
per-batch data in memory (and that can be an almost arbitrary amount).

What I think we should do instead is make the threshold dynamic -
instead of checking just against work_mem, only increase the number of
batches when the total amount of memory used exceeds

    Max(work_mem, 3 * memory_for_buffiles)

This is based on the observation that by increasing the number of
batches we double memory_for_buffiles and split the per-batch data in
half. Below that threshold, adding more batches would not reduce the
amount of memory used - it would actually increase it.
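
To spell out where the factor of 3 comes from, using that same
observation:

    before:  per_batch_data + memory_for_buffiles
    after:   per_batch_data/2 + 2 * memory_for_buffiles

    after < before   <=>   per_batch_data > 2 * memory_for_buffiles
                     <=>   per_batch_data + memory_for_buffiles
                               > 3 * memory_for_buffiles

So requiring the total to exceed 3 * memory_for_buffiles is exactly the
condition under which doubling the number of batches still pays off.
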
Of course, this just stops enforcing work_mem at some point, but it at
least attempts to minimize the amount of memory used.

An alternative would be spilling the extra tuples into a special
overflow file, as I explained earlier. That would actually enforce
work_mem, I think.

regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services