On Wed, Oct 12, 2022 at 6:47 AM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> However, that doesn't explain the downthread report that a
> VACUUM on the primary fixed it. What I suspect is that that
> caused some in-fact-dead index entries to get cleaned out.

Seems likely.

> But ... if the primary is allowed to vacuum away an index
> entry that it thinks is dead, exactly what is the point of
> making standbys ignore LP_DEAD bits? There's no additional
> interlock that guarantees the tuple will be there at all.

The interlock doesn't really protect the leaf page or its index tuples
so much as the referenced TIDs themselves. In other words it's a TID
recycling interlock.

That's why we don't need a cleanup lock to perform index tuple
deletions, even though the WAL records for those are almost identical
to the WAL records used by index vacuuming (in the case of nbtree the
only difference is the extra latestRemovedXid field in the deletion
variant WAL record). We know that there is no VACUUM process involved,
and no question of heap vacuuming going ahead for the same TIDs once
index vacuuming is allowed to complete.

We can get away with not having the interlock at all in the case of
nbtree index scans with MVCC snapshots -- but *not* with index-only
scans. See "Making concurrent TID recycling safe" in the nbtree
README. I only got around to documenting all of the details here quite
recently. The index-only scan thing dates back to 9.5.

-- 
Peter Geoghegan