Interestingly enough, I hopped on the database system this morning and found the `datfrozenxid` dropped back down below `autovacuum_freeze_max_age` around 0200 local time (roughly 18 hours after the fact). Looking through the Postgresql logs I don't see anything standing out at that time. I still plan on patching to 10.17 tonight. Matt On Wed, Sep 1, 2021 at 4:01 PM Matthew Tice <mjtice@xxxxxxxxx> wrote: > > Hi Alvaro, thanks for the quick reply. > > I'm scheduled to do my patching maintenance at the end of this month - > but at this point I don't think I'm going to make it. > > Other than patching, is there a work around? For example, in #2 above: > >The fix for 2) is simpler, > > simply always remove both the shared and local init files. > > I'm not familiar with the differences between 'shared' and 'local' > init files (I'd imagine I referenced a 'local' file in my original > post)? > > > > Thanks! > > Matt > > On Wed, Sep 1, 2021 at 3:00 PM Alvaro Herrera <alvherre@xxxxxxxxxxxxxx> wrote: > > > > On 2021-Sep-01, Matthew Tice wrote: > > > > [ problem table is pg_database ] > > > > > My primary, read/write database is Postgresql 10.4 (CentOS 7) while my > > > standby databases have been patched to 10.17. > > > > Hmm, I think there was a bug in the early 10.x versions where advancing > > the xid age of shared tables would not work correctly for some reason ... > > Ah yes, this was fixed in 10.5, a mere three years ago: > > > > Author: Andres Freund <andres@xxxxxxxxxxx> > > Branch: master Release: REL_11_BR [a54e1f158] 2018-06-12 11:13:21 -0700 > > Branch: REL_10_STABLE Release: REL_10_5 [2ce64caaf] 2018-06-12 11:13:21 -0700 > > Branch: REL9_6_STABLE Release: REL9_6_10 [6a46aba1c] 2018-06-12 11:13:21 -0700 > > Branch: REL9_5_STABLE Release: REL9_5_14 [14b3ec6f3] 2018-06-12 11:13:21 -0700 > > Branch: REL9_4_STABLE Release: REL9_4_19 [817f9f9a8] 2018-06-12 11:13:22 -0700 > > Branch: REL9_3_STABLE Release: REL9_3_24 [9b9b622b2] 2018-06-12 11:13:22 -0700 > > > > Fix bugs in vacuum of shared rels, by keeping their relcache entries current. > > > > When vacuum processes a relation it uses the corresponding relcache > > entry's relfrozenxid / relminmxid as a cutoff for when to remove > > tuples etc. Unfortunately for nailed relations (i.e. critical system > > catalogs) bugs could frequently lead to the corresponding relcache > > entry being stale. > > > > This set of bugs could cause actual data corruption as vacuum would > > potentially not remove the correct row versions, potentially reviving > > them at a later point. After 699bf7d05c some corruptions in this vein > > were prevented, but the additional error checks could also trigger > > spuriously. Examples of such errors are: > > ERROR: found xmin ... from before relfrozenxid ... > > and > > ERROR: found multixact ... from before relminmxid ... > > To be caused by this bug the errors have to occur on system catalog > > tables. > > > > The two bugs are: > > > > 1) Invalidations for nailed relations were ignored, based on the > > theory that the relcache entry for such tables doesn't > > change. Which is largely true, except for fields like relfrozenxid > > etc. This means that changes to relations vacuumed in other > > sessions weren't picked up by already existing sessions. Luckily > > autovacuum doesn't have particularly longrunning sessions. > > > > 2) For shared *and* nailed relations, the shared relcache init file > > was never invalidated while running. That means that for such > > tables (e.g. pg_authid, pg_database) it's not just already existing > > sessions that are affected, but even new connections are as well. > > That explains why the reports usually were about pg_authid et. al. > > > > To fix 1), revalidate the rd_rel portion of a relcache entry when > > invalid. This implies a bit of extra complexity to deal with > > bootstrapping, but it's not too bad. The fix for 2) is simpler, > > simply always remove both the shared and local init files. > > > > Author: Andres Freund > > Reviewed-By: Alvaro Herrera > > Discussion: > > https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@xxxxxxxxxxxxxxxxx > > https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@xxxxxxxxxxxxxx > > https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@xxxxxxxxxxxxxx > > https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@xxxxxxxxxxxxxx > > Backpatch: 9.3- > > > > > > -- > > Álvaro Herrera 39°49'30"S 73°17'W > > "El número de instalaciones de UNIX se ha elevado a 10, > > y se espera que este número aumente" (UPM, 1972)