On 09/28/2011 08:54 PM, Maria L. Wilson wrote:
UPDATE dataset
SET gracount = gracount + 1
WHERE dataset.inv_id = NEW.inv_id;
That'll serialize all transactions that touch the same inv_id, so only
one can run at a time. The second and subsequent ones will block waiting
for a row lock on the `dataset' tuple for that `inv_id'.
When you think about it, that's necessary to prevent a wrong result: if
transaction A and then B run this statement, B commits, and A rolls
back, what's the correct value of `gracount'?
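To make the blocking concrete, here's a sketch of two concurrent
sessions (session labels and the inv_id value are illustrative, not from
your schema):

```sql
-- Session A
BEGIN;
UPDATE dataset SET gracount = gracount + 1 WHERE inv_id = 42;
-- A now holds a row lock on the inv_id = 42 tuple until it ends.

-- Session B, concurrently
BEGIN;
UPDATE dataset SET gracount = gracount + 1 WHERE inv_id = 42;
-- B blocks here until A commits or rolls back, because the correct
-- gracount B should produce depends on whether A's increment survives.
```

If A does lots of unrelated work before committing, B (and everyone
behind it) sits in that queue the whole time.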
To fix this you'll need to change how you maintain your `dataset' table.
Exactly how depends on your requirements.
You can trade read performance off against write performance by
INSERTing new rows instead of UPDATEing them, so you do a:
SELECT count(inv_id) FROM dataset WHERE dataset.inv_id = ?
instead of a
SELECT gracount FROM dataset WHERE dataset.inv_id = ?
to retrieve your count.
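A minimal sketch of that insert-only variant, where the table name is an
assumption for illustration:

```sql
-- Instead of one counter row per inv_id, append one row per event.
CREATE TABLE dataset_hits (
    inv_id integer NOT NULL
);

-- Writers no longer contend: concurrent INSERTs into different rows
-- never block each other, and a rollback simply discards the row.
INSERT INTO dataset_hits (inv_id) VALUES (42);

-- Readers pay the aggregation cost instead of reading a stored count:
SELECT count(*) FROM dataset_hits WHERE inv_id = 42;
```

An index on `inv_id' keeps the count query tolerable, but reads still do
real work that the stored counter avoided.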
You can have both fast reads and fast writes if you accept potentially
out-of-date `gracount' data, maintaining `gracount' as a materialized
view that you refresh using LISTEN/NOTIFY *after* a transaction commits.
It's possible for it to be a little out of date, but writers no longer
interfere with each other and readers no longer have to re-do the
counting/aggregation work.
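One possible shape for that, assuming an event table like the one
sketched above and a small worker process that holds a listening
connection (channel and table names are illustrative):

```sql
-- In the writing transaction: announce the change. The notification
-- is only delivered if (and when) the transaction commits, so a
-- rollback never triggers a refresh.
NOTIFY dataset_changed, '42';

-- In a separate worker session:
LISTEN dataset_changed;
-- On each notification, refresh the stored count for the reported
-- inv_id, outside any writer's transaction:
UPDATE dataset
SET gracount = (SELECT count(*) FROM dataset_hits WHERE inv_id = 42)
WHERE inv_id = 42;
```

Readers then get the cheap `SELECT gracount' path, writers never queue
behind each other, and the only cost is that `gracount' can lag briefly
behind commits.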
You can live with serializing writes like you currently do in exchange
for the greater read performance of maintaining counts. This may not be
so bad once you understand what's happening and can keep transactions
that trigger this problem short, preventing them from blocking others
while they do unrelated work.
In the end, this looks like a simple problem but it isn't when you
consider the possibility of transactions rolling back.
Our first try to solve this problem has been to convert these triggers
into a constraint trigger which allows for DEFERRABLE INITIALLY DEFERRED
flags. We are finding that this defers the trigger function until the
end of the triggering transaction, just before it commits. We believe
this will fix our locking problem and hopefully speed up our inserts
again.
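For reference, a deferred constraint trigger of that form might look
like the following; the trigger name, triggering table, and function
name are assumptions, since your actual definitions weren't posted:

```sql
-- A constraint trigger may be declared DEFERRABLE INITIALLY DEFERRED,
-- which postpones firing it until just before COMMIT rather than at
-- statement time, shortening the window in which the row lock is held.
CREATE CONSTRAINT TRIGGER maintain_gracount
AFTER INSERT ON inventory
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE update_gracount();
```

Note that the lock on the `dataset' row is still taken when the trigger
fires; deferring only moves that to the very end of the transaction.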
That should help, but it's a form of trading timeliness off against
performance. Queries within the same transaction won't see the updated
`dataset' values, so if you're relying on them for correct operation
later in the transaction you could have issues.
--
Craig Ringer
--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)