Re: ERROR: invalid memory alloc request size 1073741824

Stefan Blanke <stefan.blanke@xxxxxxxxxxxxxx> · Wed, 11 Mar 2020 15:26:09 +0000

Hi,

We've upgraded to PostgreSQL 11.5 (postgresql.org rhel 6 rpm) and I have 
had another occurrence of this invalid alloc of 1GB. Apologies for never 
providing a query plan when discussing this two years ago; we decided to 
move to a newer PostgreSQL to see if the issue went away but took a 
while to complete the move.

The invalid alloc still only occurs occasionally every few months on a 
query that we run every minute; so unfortunately we still don't have a 
contained reproducible test case.

This is the SQL we are running with a query plan - the query plan is 
from an new empty database so the planner has no stats.

CREATE OR REPLACE FUNCTION
create_table()
RETURNS TABLE (a INTEGER, b INTEGER)
AS $$
    UPDATE y
    SET c = true,
        d = false
    WHERE e IS NOT true
        AND f IS NOT true
        AND g = 1
        AND h = 0
        AND i = 0
        AND (j IS NULL
            OR j > 0)
    RETURNING y.a, y.b;
$$ LANGUAGE SQL;

-- Prepared statement (PQprepare)
CREATE TEMPORARY TABLE temp_table AS
SELECT * FROM create_table()

-- Prepared statement (PQprepare)
SELECT y.a, y.b,
     x.k,
     x.l,
     y.m,
     y.n,
     y.o
 FROM temp_table
     INNER JOIN y ON temp_table.b = y.b
         AND temp_table.a = y.a
     INNER JOIN x ON x.a = y.a

-- The two prepared statements are executed one after another
-- in the order shown many times an hour.

The query plan for the second prepared statement is:

 Nested Loop  (cost=17.14..64.38 rows=16 width=112)
   ->  Hash Join  (cost=17.00..61.47 rows=16 width=80)
         Hash Cond: ((temp_table.b = y.b) AND (temp_table.a = y.a))
         ->  Seq Scan on temp_table  (cost=0.00..32.60 rows=2260 width=8)
         ->  Hash  (cost=12.80..12.80 rows=280 width=76)
               ->  Seq Scan on y  (cost=0.00..12.80 rows=280 width=76)
   ->  Index Scan using x_pkey on x  (cost=0.14..0.18 rows=1 width=40)
         Index Cond: (a = temp_table.a)

Thanks,
Stefan

On 31/01/2018 21:23, Tomas Vondra wrote:

On 01/31/2018 09:51 PM, Jan Wieck wrote:

On Wed, Jan 31, 2018 at 12:32 PM, Stefan Blanke
<stefan.blanke@xxxxxxxxxxxxxx <mailto:stefan.blanke@xxxxxxxxxxxxxx>> wrote:

     >
     > I'll bet you it's not that. It's quite unlikely that would fail with
     > exactly 1GB request size. It seems much more like a buffer that we keep
     > to be power of 2. The question is which one.

     I had dismissed corruption before writing in. It's exactly 1GB every
     time this has happened - and we can dump the full dataset
     periodically without issue.

     >> I have my money on a corrupted TOAST entry. Is this happening on
     >> trustworthy hardware or beige box with no ECC or RAID?

     It's good quality commercial hardware in our colo - no exactly sure
     what.

If it is a sporadic issue and you can dump the full dataset, then I just
lost my money (Tomas, you coming to PGConf in Jersey City?).

Unfortunately no, but I'm sure there will be other opportunities to buy
me a beer ;-) Like pgcon, for example.

But then, if this is a plain COPY to stdout ... I am wondering what
could possibly be in that path that wants to allocate 1GB. Or is this
not so plain but rather a COPY ... SELECT ...?

That's what I've been guessing, and why I was asking for a query plan.

regards