Re: 12.3 replicas falling over during WAL redo

Kyotaro Horiguchi <horikyota.ntt@xxxxxxxxx> · Mon, 03 Aug 2020 13:39:42 +0900 (JST)

At Sat, 1 Aug 2020 09:58:05 -0700, Ben Chobot <bench@xxxxxxxxxxxxxxx> wrote in 
> 
> 
> Alvaro Herrera wrote on 8/1/20 9:35 AM:
> > On 2020-Aug-01, Ben Chobot wrote:
> >
> >> We have a few hundred postgres servers in AWS EC2, all of which do
> >> streaming
> >> replication to at least two replicas. As we've transitioned our fleet
> >> to
> >> from 9.5 to 12.3, we've noticed an alarming increase in the frequency
> >> of a
> >> streaming replica dying during replay. Postgres will log something
> >> like:
> >>
> >> |2020-07-31T16:55:22.602488+00:00 hostA postgres[31875]: [19137-1]
> >> |db=,user=
> >> LOG: restartpoint starting: time 2020-07-31T16:55:24.637150+00:00
> >> hostA
> >> postgres[24076]: [15754-1] db=,user= FATAL: incorrect index offsets
> >> supplied
> >> 2020-07-31T16:55:24.637261+00:00 hostA postgres[24076]: [15754-2]
> >> db=,user=
> >> CONTEXT: WAL redo at BCC/CB7AF8B0 for Btree/VACUUM: lastBlockVacuumed
> >> 1720
> >> 2020-07-31T16:55:24.642877+00:00 hostA postgres[24074]: [8-1]
> >> db=,user= LOG:
> >> startup process (PID 24076) exited with exit code 1|
> > I've never seen this one.
> >
> > Can you find out what the index is being modified by those LSNs -- is
> > it
> > always the same index?  Can you have a look at nearby WAL records that
> > touch the same page of the same index in each case?
> >
> > One possibility is that the storage forgot a previous write.
> 
> I'd be happy to, if you tell me how. :)
> 
> We're using xfs for our postgres filesystem, on ubuntu bionic. Of
> course it's always possible there's something wrong in the filesystem
> or the EBS layer, but that is one thing we have not changed in the
> migration from 9.5 to 12.3.

All of the cited log lines seem suggesting relation with deleted btree
page items. As a possibility I can guess, that can happen if the pages
were flushed out during a vacuum after the last checkpoint and
full-page-writes didn't restored the page to the state before the
index-item deletion happened(that is, if full_page_writes were set to
off.). (If it found to be the cause, I'm not sure why that didn't
happen on 9.5.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center