Re: SSI and predicate locks - a non-trivial use case

Kevin Grittner <kgrittn@xxxxxxxxx> · Tue, 3 Sep 2013 07:12:54 -0700 (PDT)

Gianni Ceccarelli <dakkar@xxxxxxxxxxxxxxx> wrote:
> On 2013-08-31 Kevin Grittner <kgrittn@xxxxxxxxx> wrote:

>> [Locks without PIDs] are predicate locks related to a
>> transaction which has been PREPARED (for two-phase commit) or
>> committed, but which may still be relevant because there are
>> overlapping read-write transactions which are still active.  One
>> long-running SELECT statement, if it is not declared to be in a
>> read-only transaction, could cause a large number of such locks
>> to accumulate.
>
> So a long "read committed" transaction will cause locks from
> "serialisable" transactions to accumulate? Good to know, I had
> not realised that.

I stated that poorly -- if I remember correctly, long-running
serializable read-write transactions should cause predicate locks
of committed overlapping serializable transactions to retained;
transactions using other isolation levels or which are read-only
should not have this affect.  Predicate locks from a prepared
transaction, however, must be kept at least until commit of the
prepared transaction, at which point they must be kept until
completion of all serializable read-write transactions running at
the moment of commit.

There is one more special case of predicate locks without a pid,
although it seems rather unlikely to be in play here -- if a large
number of committed transactions exhausts the limit on predicate
locks, the locks will be summarized.  A summary predicate lock will
not only have a missing pid but a missing virtualtransaction in
pg_locks.  These should get cleaned up when the last of the
summarized transactions would have been.  Even if you are seeing
these, summarization should make the count of locks without a pid
lower, not higher -- unless there is a bug specific to cleanup of
summarized transactions.

If none of this explains the locks without you are seeing without a
pid, it it possible there is an undiscovered bug in SSI.

>> If you are not already doing so, make sure that any serializable
>> transaction which is not going to modify data is declared to be
>> READ ONLY at the start.
>
> All our serializable transactions modify data. Those that don't,
> don't need to be isolated that much, so we declare them "read
> committed". Should we declare them as "read committed, read
> only"?

If they are not serializable declaring them read-only should not
affect this issue.

This is a digression, but be sure to consider ways in which even a
read-only transaction might see anomalies.  You might not be
vulnerable to such problems, but I just wanted to point out the
possibility:

http://wiki.postgresql.org/wiki/SSI#Read_Only_Transactions

>>> - Is the large number of page locks to be expected?
>>
>> There is probably some database transaction (and that will count
>> individual statements not explicitly included in transactions)
>> which is running for minutes.
>
> The slow transactions should only be "read committed". Or we may
> have some bug in the code. I'll keep looking.

Unless I'm missing something, there is either a long running
serializable read-write transaction on your cluster or a bug in SSI
cleanup.  Be sure to consider transactions in other databases and
ad hoc queries.

>> As you probably noticed, the heuristics in
>> PredicateLockPromotionThreshold() are pretty simple.
>
> Yes, that's the main reason I decided to write to the list.
>
>> and you can always just go really high on
>> max_pred_locks_per_transaction instead.
>
> Could it be useful to document that predicate locks are very
> small (from the source, I'd say around 150 bytes), so that people
> don't get scared to set max_pred_locks_per_transaction very high?

Perhaps.  Increasing the default setting by a factor of 10 with the
default max_connections of 100 would reserve about 8.4 MB of
additional RAM for predicate locks. With heavy use of serializable
transactions, it is common to need to go that far or further, which
surprises many people; and I suspect there may be some trepidation
about the memory impact that the documentation could allay.  Do you
have suggested wording?

>> If you have a workload which you think would do better with
>> something more sophisticated, it would be great to have more
>> details.
>
> I'm no longer sure that our system is interesting, but I'll be
> glad to provide as much detail as I can gather. What kind of
> details would be useful?

When you see the high count of SIRead locks with null PID in
pg_locks, the complete output of pg_stat_activity and pg_locks
would be interesting.  You could attach those as a compressed
tarball, or if there is concern about posting that publicly you
could send it to me.  I think that might be enough to determine
whether there is a bug in predicate lock cleanup.

>> If you wanted to benchmark your workload against a custom
>> version with a modified PredicateLockPromotionThreshold()
>> function, that would be fantastic.
>
> I don't know enough to write such a modified version, but I can
> run it.

At this point I'm not sure how to tweak it to make anything better
for you.  If the rows are narrow and you are getting false positive
serialization failures because of the promotion of heap tuple locks
to page locks, it might be interesting to test a version of the
function which bases the number needed for promotion on the maximum
number of tuples which can fit on a page for that particular
relation.  Perhaps promote to a page lock when the page hits the
point where 50% of the maximum tuples for a page have been locked. 
The question would be whether the performance gain from fewer
transaction retries would outweigh the cost of the more complex
calculation.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general