Re: Weird planner issue on a standby

Peter Geoghegan <pg@xxxxxxx> · Wed, 12 Oct 2022 09:12:42 -0700

On Wed, Oct 12, 2022 at 6:47 AM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> However, that doesn't explain the downthread report that a
> VACUUM on the primary fixed it.  What I suspect is that that
> caused some in-fact-dead index entries to get cleaned out.

Seems likely.

> But ... if the primary is allowed to vacuum away an index
> entry that it thinks is dead, exactly what is the point of
> making standbys ignore LP_DEAD bits?  There's no additional
> interlock that guarantees the tuple will be there at all.

The interlock doesn't really protect the leaf page or its index tuples
so much as the referenced TIDs themselves. In other words, it's a TID
recycling interlock: it stops VACUUM from making those heap TIDs
reusable while a scan might still follow them.

That's why we don't need a cleanup lock to perform index tuple
deletions, even though the WAL records for those are almost identical
to the WAL records used by index vacuuming (in the case of nbtree the
only difference is the extra latestRemovedXid field in the deletion
variant of the WAL record). We know that no VACUUM process is involved
in these deletions, so there is no question of heap vacuuming going
ahead for the same TIDs afterwards, the way it does once index
vacuuming is allowed to complete.

We can get away with not having the interlock at all in the case of
nbtree index scans with MVCC snapshots -- but *not* with index-only
scans. See "Making concurrent TID recycling safe" in the nbtree
README. I only got around to documenting all of the details here quite
recently. The index-only scan restriction dates back to 9.5.
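
To make the rule in the previous paragraph concrete, here is a toy
model of it in Python. This is only a sketch of the decision as stated
in this mail, not the nbtree code: ScanKind and
needs_tid_recycling_interlock are hypothetical names, and the real
code may check further conditions that are not modeled here. My
understanding of the reasoning is that a plain index scan with an MVCC
snapshot always visits the heap and rechecks visibility there, so a
recycled TID cannot produce a wrong answer, whereas an index-only scan
may skip the heap visit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanKind:
    """Hypothetical description of an nbtree scan (illustration only)."""
    mvcc_snapshot: bool  # scan uses an MVCC snapshot
    index_only: bool     # scan may return tuples without visiting the heap


def needs_tid_recycling_interlock(scan: ScanKind) -> bool:
    """Return True if the scan must rely on the TID recycling interlock.

    Encodes only the rule stated in this mail: plain index scans with
    MVCC snapshots can do without the interlock; index-only scans
    cannot, and neither can scans with non-MVCC snapshots.
    """
    if scan.index_only:
        # May skip the heap visit, so a recycled TID would go unnoticed.
        return True
    # A plain scan is only safe without the interlock under MVCC rules.
    return not scan.mvcc_snapshot


if __name__ == "__main__":
    for mvcc in (True, False):
        for index_only in (True, False):
            kind = ScanKind(mvcc_snapshot=mvcc, index_only=index_only)
            print(kind, needs_tid_recycling_interlock(kind))
```

Only one of the four combinations (MVCC snapshot, not index-only) gets
away without the interlock.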

-- 
Peter Geoghegan