Chris Angelico wrote:
> Craig Ringer <ringerc@xxxxxxxxxxxxx> wrote:
>> BTW, the issue with the underlying question is that their "name"
>> column is unique. They expected to get a serialization failure on
>> duplicate insert into "name", not a unique constraint violation.
>> The question wasn't "why doesn't this fail" but "Why does this
>> fail with a different error than I expected".

I remember reading a paper about an attempt to use software to do a
static analysis of software accessing production databases, to identify
where additional protections (explicit locking, conflict promotion, or
conflict materialization) were needed to prevent serialization
anomalies under snapshot isolation. They initially got a lot of hits
for situations where no anomaly could actually happen because of
declarative constraints. Primary keys, unique indexes, and foreign keys
could all prevent anomalies in situations where you would see them if
the logic were left to, for example, trigger code instead of the
declarative constraints. The researchers argued that in such situations
there was no point in generating extra overhead to redundantly protect
data integrity with other techniques. I was persuaded. (I tried to
find the paper to reference it here, but wasn't successful -- I know
that Alan Fekete was one of the authors and that the software they
analyzed was in production use at the Indian Institute of Technology:
an accounting system and a system for tracking academic information.)

> Sounds to me like it's giving a better error anyway - more helpful
> to know _why_ the second transaction failed than to simply know
> _that_ it failed.

It is a double-edged sword -- you get a more efficient way to protect
the data and a more specific error message; *but* you don't get a
SQLSTATE which makes it clear that the error was due to a race
condition and that it is reasonable to retry the transaction. The
application programmer can still find techniques which allow automated
retry without bugging the user with spurious error messages about a
race condition from which the software can automatically recover (see
the retry sketch below), but this does undercut the mantra that the
transaction will either do the right thing or generate a serialization
failure.

As an example, if you want to let serialization failures deal with
automatic retries, without pestering the user about having hit a
recoverable race condition, you stay away from the SELECT max(id) + 1
style of key assignment in favor of sequences (if contiguous numbers
are not critical) or updating a "last_used" number in a table and using
the result (if it *is* critical that there are no gaps in the numbers).
That is actually the only "special rule" I've found to be needed in
practice so far to allow programmers to code each transaction as though
it were going to be run alone, and otherwise ignore concurrency issues
when using serializable transactions.

-Kevin
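
A minimal sketch of that kind of automated retry, in Python with
psycopg2 (the driver choice and all names here are illustrative, not
from the discussion above). It treats serialization_failure (SQLSTATE
40001) as retryable and -- only because, as in the case discussed, the
unique constraint is what stands in for the serialization check --
treats unique_violation (23505) the same way:

    # Sketch only: assumes an already-open psycopg2 connection; the
    # error-code handling is the point here.
    import psycopg2
    from psycopg2 import errorcodes

    # SQLSTATEs we treat as "recoverable race, just retry".  Including
    # UNIQUE_VIOLATION is only safe when the unique constraint exists to
    # guard against exactly this race.
    RETRYABLE = {errorcodes.SERIALIZATION_FAILURE,   # 40001
                 errorcodes.UNIQUE_VIOLATION}        # 23505

    def run_with_retry(conn, txn_body, max_attempts=5):
        """Run txn_body(cursor) in a SERIALIZABLE transaction, retrying on races."""
        for attempt in range(1, max_attempts + 1):
            try:
                with conn:                      # commit on success, roll back on error
                    with conn.cursor() as cur:
                        cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
                        return txn_body(cur)
            except psycopg2.Error as e:
                if e.pgcode in RETRYABLE and attempt < max_attempts:
                    continue                    # hit the race; run the transaction again
                raise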
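
The "last_used" style of gapless key assignment can be sketched the
same way (table and column names made up for illustration). Because the
counter row is updated rather than merely read, concurrent callers
conflict on it and the loser gets a plain serialization failure, which
the retry loop above handles without bothering the user:

    # Sketch only; assumed schema:
    #   CREATE TABLE invoice_number (last_used integer NOT NULL);  -- one row
    #   CREATE TABLE invoice (id integer PRIMARY KEY, note text);
    def create_invoice(cur, note):
        # Bump the counter and use the result as the key; retries reuse the
        # same counter row, so there are no gaps and no duplicates.
        cur.execute("UPDATE invoice_number SET last_used = last_used + 1 RETURNING last_used")
        new_id = cur.fetchone()[0]
        cur.execute("INSERT INTO invoice (id, note) VALUES (%s, %s)", (new_id, note))
        return new_id

    # Usage with the retry wrapper above:
    # new_id = run_with_retry(conn, lambda cur: create_invoice(cur, "example"))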