On Wed, Nov 22, 2017 at 01:27:12PM -0500, Tom Lane wrote:
> Justin Pryzby <pryzby@xxxxxxxxxxxxx> writes:
> > On Tue, Nov 21, 2017 at 03:40:27PM -0800, Andres Freund wrote:
> >> Could you try stracing next time?
> 
> > I straced all the "startup" PIDs, which were all in futex, without exception:
> 
> If you've got debug symbols installed, could you investigate the states
> of the LWLocks the processes are stuck on?
> 
> My hypothesis about a missed memory barrier would imply that there's (at
> least) one process that's waiting but is not in the lock's wait queue and
> has MyProc->lwWaiting == false, while the rest are in the wait queue and
> have MyProc->lwWaiting == true.  Actually chasing through the list
> pointers would be slightly tedious, but checking MyProc->lwWaiting,
> and maybe MyProc->lwWaitMode, in each process shouldn't be too hard.
> Also verify that they're all waiting for the same LWLock (by address).

I believe my ~40 cores are actually for backends from two separate
instances of this issue on the VM, as evidenced by different argv
pointers.  And for each instance, I have cores for only a fraction of the
backends (max_connections=400).

For starters, I found that PID 27427 has:

(gdb) p proc->lwWaiting
$1 = 0 '\000'
(gdb) p proc->lwWaitMode
$2 = 1 '\001'

..whereas all the others have lwWaiting = 1.

For #27427:

(gdb) p *lock
$27 = {tranche = 59, state = {value = 1627389952}, waiters = {head = 147, tail = 308}}
(gdb) info locals
mustwait = 1 '\001'
proc = 0x7f1a77dba500
result = 1 '\001'
extraWaits = 0
__func__ = "LWLockAcquire"

At this point I have to ask for help with finishing the traversal of these
structures.  I could upload the cores for someone (I don't think there's
anything too private in them), but so far I have 16GB of gzip-compressed
cores.

Note: I've locally compiled PG 10.1 with PREFERRED_SEMAPHORES=SYSV to keep
the service up (and to the degree that also serves to verify that SysV
semaphores avoid the issue, great).  But I could start an instance running
pgbench to try to trigger the issue on this VM, with smaller
shared_buffers and fewer backends/clients, to allow full cores of every
backend (I don't think I'll be able to dump all 400 cores of up to 2GB
each from the production instance).

Could you suggest how I can maximize the likelihood/speed of triggering
it?

Five years ago, in response to a report of similar symptoms, you said
"You need to hack pgbench to suppress the single initialization connection
it normally likes to make, else the test degenerates to the
one-incoming-connection case":
https://www.postgresql.org/message-id/8896.1337998337%40sss.pgh.pa.us

..but pgbench --connect seems to do what's needed(?)  (I see that option
dates back to 2001, having been added in commit ba708ea3.)

(I don't know that there's any suggestion or reason to believe the bug is
specific to the connection/startup phase, or that connecting is necessary
or sufficient to hit it, but startup is at least known to be affected, and
it's all I have to go on for now.)

Justin
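
P.S. Here is my best guess at how to finish walking the waiters list in
gdb, assuming the PG 10 proclist layout (waiters.head/tail are pgprocno
indexes into ProcGlobal->allProcs, and each PGPROC links to the next
waiter via lwWaitLink.next) -- please correct me if any of those
assumptions are wrong:

(gdb) set $i = lock->waiters.head
(gdb) while ($i != lock->waiters.tail)
>p ProcGlobal->allProcs[$i].pid
>p ProcGlobal->allProcs[$i].lwWaiting
>p ProcGlobal->allProcs[$i].lwWaitMode
>set $i = ProcGlobal->allProcs[$i].lwWaitLink.next
>end
(gdb) p ProcGlobal->allProcs[$i].pid
(gdb) p ProcGlobal->allProcs[$i].lwWaiting
(gdb) p ProcGlobal->allProcs[$i].lwWaitMode

..the idea being to confirm that every proc actually on the queue has
lwWaiting = 1, and that the stuck proc at 0x7f1a77dba500 (lwWaiting = 0)
is not on it, per your hypothesis.

As for the pgbench reproduction, I was thinking of something along these
lines (-C reconnects for every transaction and -S keeps the transactions
trivial, so most of the time is spent in connection startup; the scale and
client/thread counts are just guesses):

  pgbench -i -s 10 pgbench
  pgbench -C -S -c 200 -j 8 -T 3600 pgbench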