[std_disclaimer]I'm not a DBA and I'm running PostgreSQL on a quad-core Intel laptop. You may read on after you stop laughing...[/std_disclaimer]

I've written a version of UPSERT designed to import a large number of records. It works in two steps: first it UPDATEs incumbent records that share common keys with the new records, then it INSERTs the new records that do not share common keys with the incumbent records. The gist of it is:

    SAVEPOINT ...;

    -- update records from subquery that share common keys with tbl
    UPDATE tbl
       SET f1 = X.f1, f2 = X.f2, ...
      FROM (subquery) AS X
     WHERE tbl.k1 = X.k1 AND tbl.k2 = X.k2 AND ...;

    -- insert records from subquery that do NOT share common keys with tbl
    INSERT INTO tbl (f1, f2, ...)
    SELECT Y.f1, Y.f2, ...
      FROM (subquery) AS Y
      LEFT OUTER JOIN tbl ON tbl.k1 = Y.k1 AND tbl.k2 = Y.k2 AND ...
     WHERE tbl.id IS NULL;

    RELEASE SAVEPOINT ...;

The (subquery) frequently generates 5000 records for update/insert. Three Unix processes run the import concurrently, all updating/inserting records into a common table. Most of the time this works, but I've observed cases where an import process will:

 * get a "PG::Error: ERROR: current transaction is aborted, commands ignored until end of transaction block"
 * get a "PG::Error: ERROR: deadlock detected"
 * go catatonic for several minutes

This leads me to believe that my SAVEPOINT / RELEASE SAVEPOINT is not the correct way to protect the system from multiprocessing mischief. I've read Tom Lane's presentation on concurrency (http://www.postgresql.org/files/developer/concurrency.pdf), but I haven't been able to figure out the right solution.

[For what it's worth, with the data I'm importing, INSERTs will far outnumber the UPDATEs.]

Since I someday plan to spin this code up on a big system with >> 3 import processes, I'd like to get this right. Hints and solutions warmly accepted. Thanks.

- rdp
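
P.S. In case a self-contained example makes the question clearer, here is a minimal sketch of the pattern above. The table definition, the UNIQUE (k1, k2) constraint, the savepoint name, and the two-row VALUES lists are all made up for illustration; the real import feeds a subquery of roughly 5000 rows per batch.

    -- hypothetical target table, just to make the pattern concrete
    CREATE TABLE tbl (
        id  bigserial PRIMARY KEY,
        k1  integer NOT NULL,
        k2  integer NOT NULL,
        f1  text,
        f2  text,
        UNIQUE (k1, k2)
    );

    BEGIN;
    SAVEPOINT import_batch;

    -- step 1: update incumbent rows that share keys with the incoming batch
    UPDATE tbl
       SET f1 = X.f1,
           f2 = X.f2
      FROM (VALUES (1, 1, 'a', 'b'),
                   (1, 2, 'c', 'd')) AS X (k1, k2, f1, f2)
     WHERE tbl.k1 = X.k1
       AND tbl.k2 = X.k2;

    -- step 2: insert incoming rows that matched no incumbent row in step 1
    INSERT INTO tbl (k1, k2, f1, f2)
    SELECT Y.k1, Y.k2, Y.f1, Y.f2
      FROM (VALUES (1, 1, 'a', 'b'),
                   (1, 2, 'c', 'd')) AS Y (k1, k2, f1, f2)
      LEFT OUTER JOIN tbl
        ON tbl.k1 = Y.k1
       AND tbl.k2 = Y.k2
     WHERE tbl.id IS NULL;

    RELEASE SAVEPOINT import_batch;
    COMMIT;

The real code runs this block once per batch inside a long-lived connection; the deadlocks and the "current transaction is aborted" errors show up only when several such processes hit the same table at once.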