On Mon, Oct 16, 2023 at 2:32 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Craig McIlwee <craigm@xxxxxx> writes:
> Most discussions regarding the lock table say that the size of the lock
> table determines how many locks can be held. The documentation for
> max_locks_per_transaction [3] reads slightly different though, and in
> particular this phrases stands out to me:
>> no more than this many distinct objects can be locked at any one time
> To me, that seems to be saying that multiple locks for the same object
> (e.g. for a single table) would only consume a single lock table entry.
> Finally on to my first question: Am I interpreting the documentation
> correctly, that multiple locks for the same object only consume a single
> lock table entry,
Yes ... however it's a good deal more complicated than that.
What actually happens under the hood is that we allocate enough shared
memory space for (MaxBackends + max_prepared_transactions) *
max_locks_per_transaction LOCK structs (which are the per-locked-object
entries) and twice that many PROCLOCK structs, which are
per-lock-per-holder information. The 2X multiplier assumes that on
average about two sessions will be holding/requesting locks on any
specific locked object.
I think that explains why I could successfully issue the same large statement from 2 concurrent transactions but then encounter an error when trying a third.
Now, MaxBackends is more than max_connections, because it also
accounts for autovacuum workers, parallel workers, etc. So that's
one of the sources of the fuzzy limit you noticed. The other source
is that we allocate about 100K more shared memory space than we think
we need, and it's possible for the lock tables to expand into that
"slop" space. I've not checked the sizes of these structs lately,
but the slop space could surely accommodate several hundred more
locks than the initial estimate allows.
OK, those 2 bits of fuzz look like they could increase the lock table size by at least 1000 entries, maybe even 2000, which explains how I could get ~7500 locks without an error. 100K / 168 = 609 extra slots. Assuming MaxBackends is 10 higher than max_connections (wild guess there, but maybe it's even higher?), then that's another 640 slots.
Even if you know how many lockable objects your installation has (which I bet you don't, or at least the number isn't likely to hold still for long)
Not exactly, but we have time based (monthly) partitioning and can make a reasonable estimate on how many years an installation will run so our guess wouldn't be too far off. My plan was to make an estimate then increase by 50% or maybe even 100% to account for new objects that could be added later.
it's pretty hard to say exactly how many PROCLOCK entries you might need.
Yeah, good point there, but I suppose our connection pool's relatively low max connection limit could be used as an upper bound.
And bloating the lock table size by max_connections/2 or so to try
to brute-force that doesn't seem like a good plan.
I'm not following - where does max_connections/2 come from?
I'd just raise max_prepared_transactions until you stop seeing
problems, and then maybe add a factor of two safety margin.
That's what we've already done for the short term solution. It is somewhat in conflict with your statement regarding the number of lockable objects not holding still for long, though. As time goes on and our scheduled jobs automatically create new monthly partitions, or as our schema evolves, we may eventually hit the limits again. That's why we'd like to create some formula that can estimate the max_locks_per_transaction value we should configure (with the previously mentioned multiplier for safety / future proofing). An alternative would be to precreate all partitions we anticipate needing so we don't get surprises down the line, but then we incur extra planning cost for tables that will stay empty for months or even years.
Craig