On 7/23/14 3:40 PM, Tom Lane wrote:
> John R Pierce <pierce@xxxxxxxxxxxx> writes:
>> On 7/23/2014 10:21 AM, Seamus Abshere wrote:
>>> Upsert is usually defined [1] in reference to violating a unique key:
>>> Is this theoretically preferable to just looking for a row that
>>> matches certain criteria, updating it if found or inserting otherwise?
>>
>> what happens when two connections do this more or less concurrently,
>> in transactions?
>
> For the OP's benefit --- the subtext John left unstated is that the
> unique-key mechanism has already solved the problem of preventing
> concurrent updates from creating duplicate keys.  If we build a version
> of UPSERT that doesn't rely on a unique index then it'll need some
> entirely new mechanism to prevent concurrent key insertion.  (And if
> you don't care about concurrent cases, you don't really need UPSERT ...)
hi all,
What if we treat atomicity as optional? You could have extremely
readable syntax like:
-- no guarantees, no index required
UPSERT age = 5 INTO dogs WHERE name = 'Jerry';
-- optionally tell us how you want to deal with collision
UPSERT age = 3 INTO dogs WHERE name = 'Jerry' ON DUPLICATE KEY KEEP NEWEST;
UPSERT age = 5 INTO dogs WHERE name = 'Jerry' ON DUPLICATE KEY KEEP NEWEST;
-- only **require** a unique index or a locked table (throwing an error otherwise) for queries like
UPSERT age = age+1 INTO dogs WHERE name = 'Jerry';
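For comparison, here is roughly what the no-guarantee form would boil
down to today. Just a sketch, assuming a plain dogs(name text, age int)
table with no unique index on name:

-- sketch of the no-guarantee form as two statements
UPDATE dogs SET age = 5 WHERE name = 'Jerry';
INSERT INTO dogs (name, age)
SELECT 'Jerry', 5
WHERE NOT EXISTS (SELECT 1 FROM dogs WHERE name = 'Jerry');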
Obviously this flies in the face of what most people say the
"fundamental Upsert property" is [1]:

    At READ COMMITTED isolation level, you should always get an atomic
    insert or update [1]
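Concretely, the race that property rules out (and that the no-guarantee
form would knowingly accept) goes like this when there is no unique
index on name:

-- session A: SELECT 1 FROM dogs WHERE name = 'Jerry';   -- no row
-- session B: SELECT 1 FROM dogs WHERE name = 'Jerry';   -- no row
-- session A: INSERT INTO dogs (name, age) VALUES ('Jerry', 5);
-- session B: INSERT INTO dogs (name, age) VALUES ('Jerry', 3);
-- both commit: two 'Jerry' rows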
I just think there are a lot of non-concurrent bulk loading and
processing workflows that could benefit from the performance advantages
of upsert (one trip to the database).
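For what it's worth, the closest single-round-trip equivalent I know of
today is a writable CTE; fine for a single-writer bulk load, but again
not race-safe without a unique index:

-- one round trip, single-writer only (sketch, not safe under concurrency)
WITH updated AS (
    UPDATE dogs SET age = 5 WHERE name = 'Jerry'
    RETURNING name
)
INSERT INTO dogs (name, age)
SELECT 'Jerry', 5
WHERE NOT EXISTS (SELECT 1 FROM updated);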
Best, thanks,
Seamus
[1] http://www.pgcon.org/2014/schedule/events/661.en.html
--
Seamus Abshere, SCEA
https://github.com/seamusabshere