Re: Row locks, SKIP LOCKED, and transactions

Thomas Munro <thomas.munro@xxxxxxxxx> · Wed, 18 Dec 2019 15:10:27 +1300

On Wed, Dec 18, 2019 at 5:12 AM Steven Winfield
<Steven.Winfield@xxxxxxxxxxxxxxxxx> wrote:
> * I observe this even if I crank up the transaction isolation level to repeatable read and serializable.

Huh.  SERIALIZABLE shouldn't allow two transactions to see no result
row for a given ID and then insert a result row for that ID.  One of
those transactions should have to roll back, because otherwise it'd be
incompatible with both serial orderings of the two transactions.

> I'm wondering if row locks are not obeying the same transactional semantics as row data,

They are indeed a bit weird.  They sometimes check if the condition
still apply (since the row might have changed between the scan and
LockRows node) which leads to some interesting effects, but only if
the row being locked was concurrently updated, and here that isn't the
case.  This is a source of a fair amount of confusion about FOR UPDATE
and joins/subselects.

> Perhaps this is a misuse of the locking system, since I'm locking a row "FOR UPDATE" but not actually updating it, but as row locks are released at the end of a transaction (according to the docs) then my expectation was for the unlocking and the visibility of newly committed rows to be atomic.
> I've tried FOR NO KEY UPDATE too, without luck.
>
> If I'm doing something forbidden (and the docs say so) then I'd be grateful if someone could point that out!

Conceptually, the thing you really need to lock for this to work is
the result row that isn't there yet, so that some overlapping
transaction doesn't try to lock the same absent thing.  Unfortunately,
our system for locking things that aren't there isn't there either.
Some articles on serializability talk about "materialising the
conflict", which means locking some other surrogate thing that
"covers" a gap you are interested in.  You might think the job row
would do the trick, but since we don't recheck the condition (that is,
recheck that there is no corresponding result because you don't update
the job row), no cigar.  You could also use plain old
pg_try_advisory_xact_lock(id), because it just locks integers, and
they're always there.

SERIALIZABLE deals with that type of magic internally (it locks gaps
in key ranges by predicate-locking a physical btree or hash page that
you'd need to write on to insert a row with a matching key, which is
how it discovers a conflict between one transaction that went looking
for key=42 but didn't find it and another that later writes key=42),
but, as mentioned, SERIALIZABLE doesn't really allow concurrency with
this workload, and you specified that you wanted concurrency with SKIP
LOCKED (but I think you'd have the same problem without it; SKIP
LOCKED just gets you the wrong answer faster).

There are various ways you could deal with this, but I'd probably go
for a simple scheme where you only have to consult a single row to
know if you can claim it.  You could still put the results into a
separate table, but use job.state to find work, and set it to DONE
when you insert the result.  It may also be possible to add no new
columns but do a dummy update to the job row to get the join qual
rechecked, but I'm not sure if that'd work.  Another reason to add a
state column to the job table is so that you can put a conditional
index on it so you can find jobs to be done very quickly, if you're
not planning to remove the ones that are done.