Re: Out of Memory errors are frustrating as heck!

Tomas Vondra <tomas.vondra@xxxxxxxxxxxxxxx> · Thu, 18 Apr 2019 17:21:28 +0200

On Wed, Apr 17, 2019 at 11:52:44PM -0400, Gunther wrote:
Hi guys. I don't want to be pushy, but I found it strange that after 
so much lively back and forth getting to the bottom of this, suddenly 
my last night's follow-up remained completely without reply. I wonder 
if it even got received. For those who read their emails with modern 
readers (I know I too am from a time where I wrote everything in plain 
text) I marked some important questions in bold.

It was received (and it's visible in the archives). It's right before
Easter, so I guess some people may already be on vacation.

As for the issue - I think the current hypothesis is that the data
distribution is skewed in some strange way, triggering some unexpected
behavior in hash join. That seems plausible, but it's really hard to
investigate without knowing anything about the data distribution :-(

It would be possible to do at least one of these two things:

(a) export pg_stats info about distribution of the join keys

The number of tables involved in the query is not that high, and this
would allow us to generate a data set approximating your data. The one
thing this can't do is show how it's affected by WHERE conditions.
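
As a rough illustration of (a), the sketch below builds a query against
the pg_stats view restricted to the join-key columns. The table and
column names in JOIN_KEYS and the "public" schema are placeholders, not
taken from the actual query, and the helper name is made up for this
example. Note that most_common_vals and histogram_bounds contain actual
column values, so the output may need the same care as the raw data.

```python
# Hypothetical (table, column) pairs; replace with the real join keys.
JOIN_KEYS = [("orders", "customer_id"), ("customers", "id")]


def pg_stats_query(join_keys, schema="public"):
    """Build a SELECT over pg_stats limited to the given join-key columns."""

    def lit(s):
        # Quote a value as an SQL string literal (doubling single quotes).
        return "'" + s.replace("'", "''") + "'"

    pairs = ", ".join(
        "({}, {})".format(lit(table), lit(column)) for table, column in join_keys
    )
    return (
        "SELECT tablename, attname, null_frac, n_distinct,\n"
        "       most_common_vals, most_common_freqs, histogram_bounds\n"
        "FROM pg_stats\n"
        "WHERE schemaname = {}\n"
        "  AND (tablename, attname) IN ({});"
    ).format(lit(schema), pairs)


# Print the query so it can be pasted into psql.
print(pg_stats_query(JOIN_KEYS))
```

The printed query can be run in psql and its output sent along; running
ANALYZE on the tables first should make the statistics reflect the
current data.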

(b) export data for join keys

This is similar to (a), but it would allow filtering data by the WHERE
conditions first. The amount of data would be higher, although we only
need data from the columns used as join keys.

Of course, if those key values contain sensitive data, it may not be
possible, but perhaps you could hash it in some way.
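
One possible way to do the hashing, as a minimal sketch: replace each
key value with a keyed digest before exporting. Equal values map to
equal digests, so the frequency distribution (which is what matters for
a hash join) should be preserved, while ordering and the values
themselves are not. The hash_key helper, the secret and the sample keys
below are all made up for illustration.

```python
import hashlib
import hmac
from collections import Counter


def hash_key(value, secret):
    """Map a join-key value to a truncated keyed SHA-256 digest (hex)."""
    if value is None:
        # Keep NULLs as NULLs so the null fraction stays visible.
        return None
    msg = str(value).encode("utf-8")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()[:16]


# Assumption: a private random secret that is never shared with the data.
secret = b"replace-with-a-private-random-value"

# Made-up sample of join-key values with a skewed distribution.
keys = [101, 101, 101, 202, 303, 303, None]
hashed = [hash_key(k, secret) for k in keys]

# The per-value counts are the same before and after hashing.
print(sorted(Counter(k for k in keys if k is not None).values()))
print(sorted(Counter(h for h in hashed if h is not None).values()))
```

The same secret has to be used for every table in the join, otherwise
matching keys would no longer match after hashing.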

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services