On Sun, Apr 14, 2019 at 11:59:45PM -0400, Gunther wrote:
> On 4/14/2019 23:24, Tom Lane wrote:
> > > ExecutorState: 2234123384 total in 266261 blocks; 3782328 free (17244 chunks); 2230341056 used
> > Oooh, that looks like a memory leak right enough. The ExecutorState
> > should not get that big for any reasonable query.
>
> 2.2 GB is massive yes.
>
> > Your error and stack trace show a failure in HashBatchContext,
> > which is probably the last of these four:
> >
> > > HashBatchContext: 57432 total in 3 blocks; 16072 free (6 chunks); 41360 used
> > > HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
> > > HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
> > > HashBatchContext: 100711712 total in 3065 blocks; 7936 free (0 chunks); 100703776 used
> >
> > Perhaps that's more than it should be, but it's silly to obsess over 100M
> > when there's a 2.2G problem elsewhere.
>
> Yes.
>
> > I think it's likely that it was
> > just coincidence that the failure happened right there. Unfortunately,
> > that leaves us with no info about where the actual leak is coming from.
>
> Strange though, that the vmstat tracking never showed that the cache
> allocated memory goes much below 6 GB. Even if this 2.2 GB memory leak
> is there, and even if I had 2 GB of shared_buffers, I would still have
> enough for the OS to give me.
Depends on how the kernel is configured. What are vm.overcommit_memory
and vm.overcommit_ratio set to, for example?
It may easily be the case that the kernel is allowing only 50% of RAM to
be committed to user space, and then refusing to allocate more despite
having free memory. That's a fairly common issue on swapless systems:
with vm.overcommit_memory=2 the commit limit is roughly swap plus
RAM * vm.overcommit_ratio / 100, and the default ratio is 50, so with no
swap only about half of RAM can be committed.
Try running the query again, watch
  cat /proc/meminfo | grep Commit
and see whether it crashes when Committed_AS hits the CommitLimit.
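If you'd rather have a log than eyeball it, a trivial loop like this
(just a sketch, pick whatever interval you like) records both values once
a second while the query runs:

  while sleep 1; do
      date '+%H:%M:%S'
      grep -E 'CommitLimit|Committed_AS' /proc/meminfo
  done > commit.log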
That doesn't explain where the memory leak is, though :-(
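One way to narrow that down, assuming you can install debug symbols and
attach gdb to the backend while the query is running, is to dump the
memory context stats a few times as the usage grows and compare them:

  gdb -p <pid of the backend running the query>
  (gdb) call MemoryContextStats(TopMemoryContext)
  (gdb) detach

The stats go to that backend's stderr (typically the server log), and
repeated dumps should show where under ExecutorState the extra memory is
accumulating.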
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services