On Sun, Sep 11, 2011 at 17:23, Andy Colson <andy@xxxxxxxxxxxxxxx> wrote: > On 09/11/2011 08:59 AM, Igor Chudov wrote: >> By the way, does that INSERT UPDATE functionality or something like this exist in Postgres? > You have two options: > 1) write a function like: > create function doinsert(_id integer, _value text) returns void as > 2) use two sql statements: Unfortunately both of these options have caveats. Depending on your I/O speed, you might need to use multiple loader threads to saturate the write bandwidth. However, neither option is safe from race conditions. If you need to load data from multiple threads at the same time, they won't see each other's inserts (until commit) and thus cause unique violations. If you could somehow partition their operation by some key, so threads are guaranteed not to conflict each other, then that would be perfect. The 2nd option given by Andy is probably faster. You *could* code a race-condition-safe function, but that would be a no-go on a data warehouse, since each call needs a separate subtransaction which involves allocating a transaction ID. ---- Which brings me to another important point: don't do lots of small write transactions, SAVEPOINTs or PL/pgSQL subtransactions. Besides being inefficient, they introduce a big maintenance burden. In PostgreSQL's MVCC, each tuple contains a reference to the 32-bit transaction ID that inserted it (xmin). After hitting the maximum 32-bit value transaction ID, the number "wraps around". To prevent old rows from appearing as new, a "vacuum freeze" process will run after passing autovacuum_freeze_max_age transactions (200 million by default) to update all old rows in your database. Using fewer transaction IDs means it runs less often. On small databases, this is usually not important. But on a 10TB data warehouse, rewriting a large part of your database totally kills performance for any other processes. This is detailed in the documentation: http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND Regards, Marti -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance