On 10 October 2017 at 21:23, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:

> What I see is that, given this particular test case, the backend
> process on the master never holds more than a few locks at a time.
> Each time we abort a subtransaction, the AE lock it was holding
> on the temp table it created gets dropped. However ... on the
> standby server, pre v10, the replay process attempts to take all
> 12000 of those AE locks at once. This is not a great plan.

Standby doesn't take locks "at once", they are added just as they
arrive. The locks are held by topxid, so not released at subxid abort,
by design, so they are held concurrently.

> v10 and HEAD avoid the problem because the standby server doesn't
> take locks (any at all, AFAICS). I suppose this must be a
> consequence of commit 9b013dc238c, though I'm not sure exactly how.

Locks are still taken, but in 9b013dc238c we just avoid trying to
release locks when transactions don't have any.

> Anyway, it's pretty scary that it's so easy to run the replay process
> out of shared memory pre-v10. I wonder if we should consider
> backpatching that fix. Any situation where the replay process takes
> more locks concurrently than were ever held on the master is surely
> very bad news.

v10 improves on this specific point because we perform lock release at
subxid abort.

Various cases have been reported over time and this has been improving
steadily in each release. It isn't "easy" to run the replay process out
of memory because clearly that doesn't happen much, but yes there are
some pessimal use cases that don't work well. The use case described
seems incredibly unreal and certainly amenable to being rewritten.

Backpatching some of those fixes is quite risky, IMHO.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
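
For readers following the thread, the difference between releasing a lock at subtransaction abort and holding it under the top-level xid can be pictured with a toy model. This is only an illustrative sketch, not PostgreSQL code: the function `peak_locks` and its parameter `release_on_subxact_abort` are hypothetical names, and the model assumes each subtransaction takes exactly one lock and then aborts, as in the test case discussed above.

```python
def peak_locks(num_subxacts, release_on_subxact_abort):
    """Toy model (not PostgreSQL code): peak number of locks held at once.

    Each subtransaction takes one lock and then aborts. If
    release_on_subxact_abort is False, the lock stays attached to the
    top-level transaction until that transaction ends.
    """
    held = set()
    peak = 0
    for subxid in range(num_subxacts):
        held.add(subxid)              # subtransaction acquires its lock
        peak = max(peak, len(held))
        if release_on_subxact_abort:
            held.discard(subxid)      # lock dropped when the subxact aborts
    held.clear()                      # top-level transaction ends
    return peak


if __name__ == "__main__":
    # Locks released at subtransaction abort: never more than one held.
    print(peak_locks(12000, release_on_subxact_abort=True))
    # Locks held under the top-level xid: they accumulate until the end.
    print(peak_locks(12000, release_on_subxact_abort=False))
```

In this simplified model the first call peaks at a single lock while the second accumulates one lock per subtransaction, which is the shape of the problem described for pre-v10 replay. Real lock-table sizing (for example via `max_locks_per_transaction`) is not modelled here.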