Scott Frazer wrote: > Hi, we have a Postgres 9.6 setup using replication that has recently started seeing a lot of processes stuck in > "SubtransControlLock" as a wait_event on the read-replicas. Like this, only usually about 300-800 of them: > > > 179706 | LWLockNamed | SubtransControlLock > 186602 | LWLockNamed | SubtransControlLock > 186606 | LWLockNamed | SubtransControlLock > 180947 | LWLockNamed | SubtransControlLock > 186621 | LWLockNamed | SubtransControlLock > > The server then begins to crawl, with some queries just never finishing until I finally shut the server down. > > Searching for that particular combo of wait_event_type and wait_event only seems to turn up the page > about statistics collection, but no helpful information on troubleshooting this lock. > > Restarting the replica server clears the locks and allows us to start working again, but it's happened > twice now in 12 hours and I'm worried it will happen again. > > Does anyone have any advice on where to start looking? I don't think there is any connection to statistics collection. This lock is used when subtransactions (SAVEPOINTs in SQL or EXCEPTION blocks in PL/pgSQL) are created, read or removed. This sounds like a PostgreSQL bug. What is the exact PostgreSQL version you are running? It would be cool if you could get a stack trace from the backend that is holding the lock. Yours, Laurenz Albe