laurenz.albe@xxxxxxxxxxx wrote:
bryn@xxxxxxxxxxxx wrote:

Oh. I was wrong, then. I'll say more on this below.
I can't follow you here, sorry. I tried this:

create table t(
  k serial primary key,
  v int not null,
  constraint t_v_unq unique(v) initially deferred);

-- RED
start transaction isolation level read committed;
insert into t(v) values (1), (2);
select k, v from t order by k;

-- BLUE
start transaction isolation level read committed;
insert into t(v) values (1), (3);
select k, v from t order by k;

-- RED
commit;
select k, v from t order by k;

-- BLUE
select k, v from t order by k;
commit;
select k, v from t order by k;

The first "select" from the "BLUE" session at the very end produces this:

 k | v
---+---
 1 | 1
 2 | 2
 3 | 1
 4 | 3

This doesn't surprise me. It's ultimately illegal, but not yet. (Before "RED" committed, "BLUE" didn't see the rows with "k = 1" and "k = 2". So it isn't seeing any other session's uncommitted data, only its own uncommitted data.) Then, when "BLUE" commits, it (of course) gets this:

ERROR:  duplicate key value violates unique constraint "t_v_unq"
DETAIL:  Key (v)=(1) already exists.

Then it sees (of course, again) only the rows with "k = 1" and "k = 2", the same as what "RED" saw.

It seems to be impossible to run a slow-motion test where "RED" and "BLUE" each issue "commit" at exactly the same moment. So thinking about this scenario doesn't tell me whether:

(a) Each session runs its constraint check, and the rest of what "commit" entails, in a genuinely serialized fashion.

OR

(b) Each session first runs its constraint check (and some other work) without serialization, and only then runs the small part of the total "commit" action (the WAL part) serialized. (This would leave bad data in the database at rest, just as my contrived misuse of "set constraints all immediate" did in my "one or two admins" scenario.)

I appreciate that this is just a different wording of what I wrote before, but now w.r.t. the system-implemented unique constraint use case. The (a) scheme sounds correct and the (b) scheme sounds wrong. Why would PG prefer to implement (b) rather than (a)? I'm clearly missing something.
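For completeness, the only extra probe I can think of is a sketch like the following (it doesn't settle the question, and my reading of its outcome is only an assumption): force the deferred check to run early in the "BLUE" session instead of waiting for its commit.

-- In the BLUE session, after its two inserts but before its commit.
-- This asks PG to run the pending deferred unique check now, against
-- BLUE's current view of the data, rather than at commit time.
set constraints all immediate;

My assumption (and it is only that) is that if "RED" has already committed, this raises the duplicate-key error straight away; and that if "RED" is still in flight, the check waits for "RED" to finish rather than passing against a stale view. The second half of that is exactly the part of the mental model that I can't confirm from the doc.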
Yes, I most certainly did think this. Where, in the PG doc, can I read the account of the proper mental model for the application programmer? It seems to be impossible to conduct an experiment that would tell me which of these two mental models is the correct one.
Your post's testcase used the condition "at least one guard on duty" and used pessimistic locking to enforce this rule, while, I assume, all sessions use just the default "read committed" isolation level. It also showed how to enforce the rule by having any session that performs the risky de-assignment of a guard use the "serializable" isolation level. This solution is easier to write but (as you imply) is less performant in a system where many concurrent sessions attempt the risky operation at the same time. You mention too that the client must implement a retry strategy, and this complicates the overall programming exercise. (Sadly, the retry cannot be encapsulated in PL/pgSQL because a block that has an exception handler cannot issue "commit", and yet serialization errors (typically?) occur only at commit time.)

My testcase used a stricter rule: the table of staff must have exactly one or two rows where the job is "Admin". So, here, concurrent sessions can break the rule (when the txn starts with one "Admin") by updating different rows to make them "Admin", or by inserting different new "Admin" rows. I've convinced myself by experiment that an ordinary trigger can enforce this rule when contending sessions use "serializable" isolation; a sketch of the kind of trigger I mean is below. Am I right that you'd say that no pessimistic locking scheme can enforce the rule at lower isolation levels, except the brute-force "lock table"?
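To make the "exactly one or two Admin rows" rule concrete, here is a minimal sketch of the kind of ordinary trigger I mean. The table shape and all of the names are just illustrative assumptions, not a copy of my actual testcase:

create table staff(
  k    serial primary key,
  name text not null,
  job  text not null);

-- Statement-level re-check of the rule after every data change.
-- (So the very first insert must already include an "Admin" row.)
create function assert_admin_count()
  returns trigger
  language plpgsql
as $$
declare
  n int;
begin
  select count(*) into n from staff where job = 'Admin';
  if n not between 1 and 2 then
    raise exception 'staff must have one or two Admin rows, found %', n;
  end if;
  return null;
end;
$$;

create trigger staff_admin_count
  after insert or update or delete on staff
  for each statement
  execute function assert_admin_count();

Under "read committed", two sessions can each start from one "Admin", each add a second, each pass this check against its own snapshot, and leave three "Admin" rows after both commit. Under "serializable", in my experiments, one of the two gets a serialization failure instead, which is the behavior I referred to above.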