On Dec 8, 2007, at 7:54 AM, John D. Burger wrote:
> So two design patterns for a makeshift UPSERT have been presented -
> one is to check beforehand, and only insert if the item isn't
> present already
... which will give the wrong results if there are any concurrent
updates...
> , the other is to do the insert blindly and let PG check for you,
> and catch any exceptions.
>
> I'm also wondering what people's ideas are for a sort of BULK
> UPSERT. I often find myself inserting the results of a SELECT and
> wanting a similar check for already existing rows. The idiom I've
> stumbled upon looks like this:
>
>   insert into foo (x, y, z)
>     select a, b, c from bar join bax ...
>     EXCEPT
>     select x, y, z from foo;
>
> Namely, I subtract from the results to be inserted any rows that
> are already present in the target table.
>
> This can actually even be used for UPSERTing a single row, and has
> the virtue of being pure SQL, but I've wondered about its efficiency.
Worry more about its correctness. Doing entirely the wrong thing,
quickly, isn't always what you want. If there's any concurrency
involved at all, this is likely to do the wrong thing.
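
To make that concrete, here is a rough sketch of one way two sessions
can interleave at the default READ COMMITTED isolation level. The
session labels and the literal row (1, 2, 3) are made up for
illustration, and I'm assuming foo has three integer columns and
starts out without that row.

```sql
-- Hypothetical interleaving of two sessions, A and B.
-- Assumes foo(x, y, z) holds integers and has no row (1, 2, 3) yet.

-- Session A:
BEGIN;
INSERT INTO foo (x, y, z)
    SELECT 1, 2, 3
    EXCEPT
    SELECT x, y, z FROM foo;    -- inserts (1, 2, 3), not yet committed

-- Session B, before A commits:
BEGIN;
INSERT INTO foo (x, y, z)
    SELECT 1, 2, 3
    EXCEPT
    SELECT x, y, z FROM foo;    -- cannot see A's uncommitted row,
                                -- so it tries to insert (1, 2, 3) too

-- Session A:
COMMIT;

-- Session B:
COMMIT;
```

With no unique constraint on foo, both inserts succeed and you end up
with the row twice. With a unique constraint covering (x, y, z),
session B's INSERT should instead wait for A to commit and then fail
with a unique violation, which is the exception the other pattern is
prepared to catch.
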
> One alternative would be to iterate over the SELECT result with a
> procedural language, and do a series of UPSERTS, but that seems
> unlikely to be as efficient for a large result set.
Just take the idiom that's been pointed out in the documentation and
wrap a loop around it.
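
Something along these lines, as a minimal sketch. It follows the
update-then-insert retry loop from the PL/pgSQL "Trapping Errors"
example in the documentation. The schema is an assumption on my part:
foo(x integer PRIMARY KEY, y text, z text) as the target, and a plain
bar(a integer, b text, c text) standing in for your real SELECT. The
function names merge_foo and bulk_merge_foo are made up.

```sql
-- Assumed schema, for illustration only:
--   foo(x integer PRIMARY KEY, y text, z text)   -- target table
--   bar(a integer, b text, c text)               -- source rows

-- Single-row upsert: update if the key exists, otherwise insert.
CREATE FUNCTION merge_foo(k integer, new_y text, new_z text)
RETURNS void AS
$$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE foo SET y = new_y, z = new_z WHERE x = k;
        IF found THEN
            RETURN;
        END IF;
        -- No such row, so try to insert it. If another session
        -- inserts the same key first, we get a unique_violation
        -- and go round the loop to retry the UPDATE.
        BEGIN
            INSERT INTO foo (x, y, z) VALUES (k, new_y, new_z);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- do nothing, loop and try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

-- Bulk version: iterate over the source query, one upsert per row.
CREATE FUNCTION bulk_merge_foo() RETURNS void AS
$$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT a, b, c FROM bar LOOP
        PERFORM merge_foo(r.a, r.b, r.c);
    END LOOP;
END;
$$
LANGUAGE plpgsql;
```

You'd then run it with SELECT bulk_merge_foo();. Note that if the
source query returns the same key more than once, the later row
overwrites the earlier one.
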
Cheers,
Steve