Hi,
We've upgraded to PostgreSQL 11.5 (postgresql.org rhel 6 rpm) and I have
had another occurrence of this invalid alloc of 1GB. Apologies for never
providing a query plan when discussing this two years ago; we decided to
move to a newer PostgreSQL to see if the issue went away but took a
while to complete the move.
The invalid alloc still only occurs occasionally every few months on a
query that we run every minute; so unfortunately we still don't have a
contained reproducible test case.
This is the SQL we are running with a query plan - the query plan is
from an new empty database so the planner has no stats.
CREATE OR REPLACE FUNCTION
create_table()
RETURNS TABLE (a INTEGER, b INTEGER)
AS $$
UPDATE y
SET c = true,
d = false
WHERE e IS NOT true
AND f IS NOT true
AND g = 1
AND h = 0
AND i = 0
AND (j IS NULL
OR j > 0)
RETURNING y.a, y.b;
$$ LANGUAGE SQL;
-- Prepared statement (PQprepare)
CREATE TEMPORARY TABLE temp_table AS
SELECT * FROM create_table()
-- Prepared statement (PQprepare)
SELECT y.a, y.b,
x.k,
x.l,
y.m,
y.n,
y.o
FROM temp_table
INNER JOIN y ON temp_table.b = y.b
AND temp_table.a = y.a
INNER JOIN x ON x.a = y.a
-- The two prepared statements are executed one after another
-- in the order shown many times an hour.
The query plan for the second prepared statement is:
Nested Loop (cost=17.14..64.38 rows=16 width=112)
-> Hash Join (cost=17.00..61.47 rows=16 width=80)
Hash Cond: ((temp_table.b = y.b) AND (temp_table.a = y.a))
-> Seq Scan on temp_table (cost=0.00..32.60 rows=2260 width=8)
-> Hash (cost=12.80..12.80 rows=280 width=76)
-> Seq Scan on y (cost=0.00..12.80 rows=280 width=76)
-> Index Scan using x_pkey on x (cost=0.14..0.18 rows=1 width=40)
Index Cond: (a = temp_table.a)
Thanks,
Stefan
On 31/01/2018 21:23, Tomas Vondra wrote:
On 01/31/2018 09:51 PM, Jan Wieck wrote:
On Wed, Jan 31, 2018 at 12:32 PM, Stefan Blanke
<stefan.blanke@xxxxxxxxxxxxxx <mailto:stefan.blanke@xxxxxxxxxxxxxx>> wrote:
>
> I'll bet you it's not that. It's quite unlikely that would fail with
> exactly 1GB request size. It seems much more like a buffer that we keep
> to be power of 2. The question is which one.
I had dismissed corruption before writing in. It's exactly 1GB every
time this has happened - and we can dump the full dataset
periodically without issue.
>> I have my money on a corrupted TOAST entry. Is this happening on
>> trustworthy hardware or beige box with no ECC or RAID?
It's good quality commercial hardware in our colo - no exactly sure
what.
If it is a sporadic issue and you can dump the full dataset, then I just
lost my money (Tomas, you coming to PGConf in Jersey City?).
Unfortunately no, but I'm sure there will be other opportunities to buy
me a beer ;-) Like pgcon, for example.
But then, if this is a plain COPY to stdout ... I am wondering what
could possibly be in that path that wants to allocate 1GB. Or is this
not so plain but rather a COPY ... SELECT ...?
That's what I've been guessing, and why I was asking for a query plan.
regards