Thanks Tom, this is intriguing. I've changed our backups to do pg_dump with verbose, and if I see this issue again I'll dig a bit with the additional information.
On Tue, Aug 25, 2020 at 4:24 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
I wrote:
> Wells Oliver <wells.oliver@xxxxxxxxx> writes:
>> It doesn't exist any longer, which lead me to try to think of things that
>> might be dropped during the dump process.
> Hm, if you're actually *dropping* matviews during the dump then it's
> not so hard to explain this error. They'd have to be ones that were
> selected to be dumped though.
I experimented a bit to try to reproduce this problem. I cannot get
any sort of error from REFRESH (with or without CONCURRENTLY) in
parallel with a pg_dump. If I drop a view or matview, I can easily
get an error, but I've not managed to reproduce one that looks like
yours; it tends to be more like
pg_dump: error: query failed: ERROR: could not open relation with OID 45698
What I found that *would* reproduce "cache lookup failed for attribute"
from pg_get_indexdef() is to explicitly drop a matview's index just
before pg_dump gets to it. So I wonder if you are doing that in your
"refresh" procedure. The timing is not terribly tight; the drop has to
happen between where pg_dump acquires its transaction snapshot and where
it tries to investigate the matview's indexes, which could be some while
in a database with many objects. Also, if the transaction doing the index
drop also takes out any exclusive locks on regular tables, that could make
it much easier to send pg_dump down this rabbit hole, since it'd block
on those locks till the damage was done.
regards, tom lane
Wells Oliver
wells.oliver@xxxxxxxxx
wells.oliver@xxxxxxxxx