Hi Tom,
This was very helpful, thanks for the tips.
Gautam
On Thu, Jan 13, 2022 at 3:36 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Gautam Bellary <gautam@xxxxxxxxxxxxxxxx> writes:
> I've got a PL/pgSQL function (regenerate_gateway_last_seen, attached) that
> loops through all partitions of 2 tables ("Measure" and "ValuelessMeasure",
> schema attached) selecting data from each into another table
> ("GatewayLastSeenNew"). Occasionally the function runs to completion, but
> usually it errors with the message copied below. I'd appreciate any advice
> to help understand why I'm seeing this and if increasing
> max_locks_per_transaction, changing another configuration value, or
> changing how the function works would improve reliability.
Yes, you ought to raise max_locks_per_transaction ...
> 2. max_locks_per_transaction is being hit in the transaction - this also
> seems unlikely because max_locks_per_transaction is set to the default
> value of 64, but there are ~8000 partitions to consider and I expect the
> resulting GatewayLastSeenNew table to have thousands of rows.
... or else reduce the number of partitions you're using. (People
frequently seem to think that more partitions are free. That is
extremely not true. I generally think that if you're using more
than a few dozen partitions per table, you are making a mistake.)
> If I was
> taking locks that would contribute towards that ceiling of 64 I'd expect
> this to fail every time, instead of failing often but not always as I
> observe.
You misinterpret what that parameter does: it is not a hard per-session
limit. This error appears when the shared lock-table pool overflows, so
you can (usually) take a lot more than 64 locks before running into
trouble. It matters what else is going on in the database.
> 3. The max size of the lock table is being exceeded - my understanding is
> that the lock table has room for max_locks_per_transaction *
> (max_connections + max_prepared_transactions) locks, which would be 64 *
> (200 + 0) = 12800 for my current configuration. I used 'SELECT COUNT(*)
> FROM PG_LOCKS' while the function was running and observe values as high as
> 21758, so if this is the issue it seems like I might not be estimating the
> max size of the lock table correctly or only specific locks contribute to
> that.
I don't recall for sure, but I think that the lock table has one entry per
lockable object, while the pg_locks view shows separate entries for
different lock modes on the same object.
regards, tom lane