Re: unexpected chunk number 2 (expected 0) for toast value ... in pg_toast_18536

Karsten Hilbert <Karsten.Hilbert@xxxxxxx> · Fri, 20 Mar 2020 19:46:13 +0100

On Mon, Mar 16, 2020 at 12:38:35PM -0700, Andres Freund wrote:

> > We have ruled out (?) below-PG hardware problems by a
> > successful run of:
> >
> > 	cp -rv —preserve=all /var/lib/postgresql/9.6  /tmp/
>
> FWIW, I don't think that rules out hardware problems at all. In plenty
> cases of corruption you can just end up with corrupted on-disk data
> (swapped blocks, zeroed blocks, randomly different values ...).

... hence the (?) ...

> But obviously it is not at all guaranteed that is the case. Could you
> describe the "history" of the database? Replication set up, failovers,
> etc?

No replication, no failovers.

There may have been hard shutdowns as in power failure but
there's no history of that to relate.

> A plpgsql function that returns the rows one-by-one and catches the
> exception is probably your best bet.

We have done that (in Python) for good measure during
recovery procedures.

> It could roughly look something like:
>
> CREATE OR REPLACE FUNCTION salvage(p_tblname regclass)
> RETURNS SETOF text
> LANGUAGE plpgsql AS
> $$
> DECLARE
>     v_row record;
> BEGIN
>     FOR v_row IN EXECUTE 'SELECT * FROM '||p_tblname::text LOOP
>        BEGIN
>            -- this forces detoasting
>            RETURN NEXT v_row::text;
>        EXCEPTION WHEN internal_error OR data_corrupted OR index_corrupted THEN
>            -- add pkey or something else
>            RAISE NOTICE 'failed to return data';
>        END;
>     END LOOP;
> END
> $$
>
> should work. You can call it like
> SELECT (salvaged_rec.rec).* FROM (SELECT salvaged_text::salvage_me FROM salvage('salvage_me') AS salvaged_text) AS salvaged_rec(rec)

Thanks for taking the time. Would something like this be a
useful addition to the adminpack extension ?

Best,
Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B