Re: startup process stuck in recovery

Tom Lane <tgl@xxxxxxxxxxxxx> · Tue, 10 Oct 2017 16:23:52 -0400

Christophe Pettus <xof@xxxxxxxxxxxx> writes:
> I was able to reproduce this on 9.5.9 with the following:

Hmm ... so I still can't reproduce the specific symptoms Christophe
reports.

What I see is that, given this particular test case, the backend
process on the master never holds more than a few locks at a time.
Each time we abort a subtransaction, the AE lock it was holding
on the temp table it created gets dropped.  However ... on the
standby server, pre v10, the replay process attempts to take all
12000 of those AE locks at once.  This is not a great plan.

On 9.5, for me, as soon as we're out of shared memory
ResolveRecoveryConflictWithLock will go into an infinite loop.
And AFAICS it *is* infinite; it doesn't look to me like it's
making any progress.  This is pretty easy to diagnose though
because it spews "out of shared memory" WARNING messages to the
postmaster log at an astonishing rate.

9.6 hits the OOM condition as well, but manages to get out of it
somehow.  I'm not very clear how, and the log trace doesn't look
like it's real clean: after a bunch of these

WARNING:  out of shared memory
CONTEXT:  xlog redo at 0/C1098AC0 for Standby/LOCK: xid 134024 db 423347 rel 524106 
WARNING:  out of shared memory
CONTEXT:  xlog redo at 0/C10A97E0 for Standby/LOCK: xid 134024 db 423347 rel 524151 
WARNING:  out of shared memory
CONTEXT:  xlog redo at 0/C10B36B0 for Standby/LOCK: xid 134024 db 423347 rel 524181 
WARNING:  out of shared memory
CONTEXT:  xlog redo at 0/C10BD780 for Standby/LOCK: xid 134024 db 423347 rel 524211 

you get a bunch of these

WARNING:  you don't own a lock of type AccessExclusiveLock
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
LOG:  RecoveryLockList contains entry for lock no longer recorded by lock manager: xid 134024 database 423347 relation 526185
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
WARNING:  you don't own a lock of type AccessExclusiveLock
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
LOG:  RecoveryLockList contains entry for lock no longer recorded by lock manager: xid 134024 database 423347 relation 526188
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
WARNING:  you don't own a lock of type AccessExclusiveLock
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
LOG:  RecoveryLockList contains entry for lock no longer recorded by lock manager: xid 134024 database 423347 relation 526191
CONTEXT:  xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04

The important point though is that "a bunch" is a finite number,
whereas 9.5 seems to be just stuck.  I'm not sure how Christophe's
server managed to continue to make progress.

It looks like the 9.6-era patch 37c54863c must have been responsible
for that behavioral change.  There's no indication in the commit message
or the comments that anyone had specifically considered the OOM
scenario, so I think it's just accidental that it's better.

v10 and HEAD avoid the problem because the standby server doesn't
take locks (any at all, AFAICS).  I suppose this must be a
consequence of commit 9b013dc238c, though I'm not sure exactly how.

Anyway, it's pretty scary that it's so easy to run the replay process
out of shared memory pre-v10.  I wonder if we should consider
backpatching that fix.  Any situation where the replay process takes
more locks concurrently than were ever held on the master is surely
very bad news.

A marginally lesser concern is that the replay process does need to have
robust behavior in the face of locktable OOM.  AFAICS whatever it is doing
now is just accidental, and I'm not sure it's correct.  "Doesn't get into
an infinite loop" is not a sufficiently high bar.

And I'm still wondering exactly what Christophe actually saw ...

			regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general