On Sun, Apr 21, 2019 at 07:25:15PM -0400, Gunther wrote:
After applying Tomas' corrected patch 0001, and routing HJDEBUG messages
to stderr:
integrator=# set enable_nestloop to off;
SET
integrator=# explain analyze select * from reports.v_BusinessOperation;
...
WARNING: ExecHashIncreaseNumBatches: increasing number of batches from 131072 to 262144
WARNING: ExecHashIncreaseNumBatches: increasing number of batches from 262144 to 524288
ERROR: out of memory
DETAIL: Failed on request of size 32800 in memory context "HashBatchContext".
Now
TopMemoryContext: 4347672 total in 9 blocks; 41688 free (18 chunks); 4305984 used
...
Portal hash: 8192 total in 1 blocks; 560 free (0 chunks); 7632 used
TopPortalContext: 8192 total in 1 blocks; 7664 free (0 chunks); 528 used
PortalHoldContext: 24632 total in 2 blocks; 7392 free (0 chunks); 17240 used
PortalContext: 1482752 total in 184 blocks; 11216 free (8 chunks); 1471536 used:
ExecutorState: 2449896 total in 16 blocks; 1795000 free (3158 chunks); 654896 used
TupleSort main: 286912 total in 8 blocks; 246792 free (39 chunks); 40120 used
...
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
HashBatchFiles: 2242545904 total in 266270 blocks; 3996232 free (14164 chunks); 2238549672 used
HashTableContext: 8192 total in 1 blocks; 7624 free (5 chunks); 568 used
HashBatchContext: 168165080 total in 5118 blocks; 7936 free (0 chunks); 168157144 used
TupleSort main: 452880 total in 8 blocks; 126248 free (27 chunks); 326632 used
Caller tuples: 1048576 total in 8 blocks; 21608 free (14 chunks); 1026968 used
...
Grand total: 2424300520 bytes in 271910 blocks; 7332360 free (17596 chunks); 2416968160 used
IMO this pretty much proves that the memory allocated for BufFile really
is the root cause of the issues with this query. 524288 batches means
up to 1048576 BufFiles, which is a bit more than ~8GB of RAM. However
those for the inner relation were not allycated yet, so at this point
only about 4GB might be allocated. And it seems ~1/2 of them did not
receive any tuples, so only about 2GB got allocated so far.
The second batch will probably make it fail much sooner, because it
allocates the BufFile stuff eagerly (both for inner and outer side).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services