On Wed, Mar 6, 2019 at 1:39 AM Boris Sagadin <boris@xxxxxxxxxxxxx> wrote:
> PgSQL 10.7, Ubuntu 16.04 LTS
>
> Symptoms:
>
> - server accepts new queries until connections exhausted (all queries are SELECT)
> - queries are active, never end, but no disk IO
> - queries can't be killed with kill -TERM or pg_terminate_backend()
> - system load is minimal (vmstat shows 100% idle)
> - perf top shows nothing
> - statement_timeout is ignored
> - no locks with SELECT relation::regclass, * FROM pg_locks WHERE NOT granted;
> - server exits only on kill -9
> - strace on SELECT process indefinitely shows:
>
> futex(0x7f00fe94c938, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff^Cstrace: Process 121319 detached
> <detached ...>
>
> GDB backtrace:
>
> (gdb) bt
> #0  0x00007f05256f1827 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0, futex_word=0x7f00fe94ba38) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1  do_futex_wait (sem=sem@entry=0x7f00fe94ba38, abstime=0x0) at sem_waitcommon.c:111
> #2  0x00007f05256f18d4 in __new_sem_wait_slow (sem=0x7f00fe94ba38, abstime=0x0) at sem_waitcommon.c:181
> #3  0x00007f05256f197a in __new_sem_wait (sem=sem@entry=0x7f00fe94ba38) at sem_wait.c:29
> #4  0x000055c9b95eb792 in PGSemaphoreLock (sema=0x7f00fe94ba38) at pg_sema.c:316
> #5  0x000055c9b965eaec in LWLockAcquire (lock=0x7f00fe96f880, mode=mode@entry=LW_EXCLUSIVE) at /build/postgresql-10-BKASGd/postgresql-10-10.7/build/../src/backend/storage/lmgr/lwlock.c:1233
> #6  0x000055c9b96497f7 in dsm_create (size=size@entry=105544, flags=flags@entry=1) at /build/postgresql-10-BKASGd/postgresql-10-10.7/build/../src/backend/storage/ipc/dsm.c:493
> #7  0x000055c9b94139ff in InitializeParallelDSM (pcxt=pcxt@entry=0x55c9bb8d9d58) at /build/postgresql-10-BKASGd/postgresql-10-10.7/build/../src/backend/access/transam/parallel.c:268

Hello Boris,

This looks like a known symptom of a pair of bugs we recently tracked
down and fixed:

1. "dsa_area could not attach to segment": dsm.c, fixed in commit 6c0fb941.
2. "cannot unpin a segment that is not pinned": dsm.c, fixed in commit 0b55aaac.

Do you see either of those messages earlier in your logs?

The bug was caused by a failure to allow for a corner case in which a
new shared memory segment gets the same ID as a recently or
concurrently destroyed one, and it is made more likely to occur by
not-very-random random numbers.  If one of these errors occurs while
cleaning up after a parallel query, a backend can self-deadlock trying
to clean up the same thing again in the error-handling path, and other
backends will then block on that lock when they next try to run a
parallel query.

The fix will be included in the next set of releases.  In the meantime
you could consider turning off parallel query (set
max_parallel_workers_per_gather = 0).  In practice I think you could
also avoid this problem by loading a library that calls something like
srandom(getpid()) in _PG_init(), so that it runs in every parallel
worker, making ID collisions extremely unlikely.  That's not really a
serious recommendation, though, since it requires writing C code.

--
Thomas Munro
https://enterprisedb.com