"Uwe C. Schroeder" <uwe@oss4u.com> writes: > On Thursday 15 April 2004 04:20 pm, Tom Lane wrote: >> ... The problem is that the SELECT >> doesn't get the right result. The SELECT actually sees two row versions >> as being valid: the one you just created by UPDATE, and whichever one >> was current when the outer transaction started. > One question to "was current when the outer transaction started". I knew I wasn't going to get away with that explanation ;-). Okay, here's the long version. Let's imagine this situation: initially there's one row with last_value 1. Transaction A comes and does begin; select ib_nextval('mycounter'); but doesn't commit yet. Now there are two rows: one with last_value 1, which is marked as created by some past committed transaction, and as deleted by transaction A which is as yet uncommitted. There is also a row with last_value 2, which is marked as created by transaction A and deleted by nobody. Now transaction B comes and does select ib_nextval('mycounter'); It's going to block until A commits --- in your version, it blocks at the LOCK-table-exclusively command, in mine at the UPDATE because the UPDATE sees the row lock on the updated row. But in either case it waits. Once xact A commits, B proceeds to mark the row with last_value 2 as deleted by itself, and creates a row with last_value 3, created by itself and deleted by nobody. (This row will have last_value 3, not something less, because the UPDATE will use the latest available committed row as the starting point for its "last_value+1" computation.) Now we come to the tricky part: transaction B does its SELECT. Which of the three available rows will it consider valid? Because this SELECT is inside a function, and we don't advance the QuerySnapshot inside a function, the SELECT will be applying MVCC rules with respect to a snapshot that was taken when the outer "select ib_nextval()" began --- in other words, before transaction A committed. So the validity checks stack up like this: * original row with last_value 1: created by a long-since-committed transaction, and deleted by a transaction (xact A) that had not committed at the time of the snapshot. Ergo, good. * second row with last_value 2: created by a not-yet-committed xact (A) and deleted by my own transaction. Loses on either count; not good. * third row with last_value 3: created by my own transaction and deleted by nobody. Ergo, good. So both last_value 1 and last_value 3 are visible to the SELECT, and it's a crapshoot which will come up first in SELECT INTO. If we were to advance the QuerySnaphot between statements of a plpgsql function, the problem would go away because the SELECT would see transaction A as already committed, making the original row not-good. Now in this situation it is good to recognize the effects of other transactions between statements of a plpgsql function, but it's not hard to think up cases in which plpgsql functions would break if the visible database state changes between statements. So it's a bit of a tough choice what to do. I'm personally starting to think that we *should* advance the QuerySnapshot, but as I said there's not yet a consensus about it. Oh, one other point: SELECT FOR UPDATE fixes this because it has different visibility rules. Like UPDATE, it will *never* consider good a row version that is marked as deleted by any committed transaction. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster