On Tue, 25 Dec 2007, fdu.xiaojf@xxxxxxxxx wrote:
insert a record into a table, and when the record already exists(according to the primary key), update it.
There is an example that does exactly that, 37-1, in the documentation at http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html It actually does the update first and only if that fails does the insert, which avoids the whole duplicate key issue altogether.
I have tried the query and update/insert way, and it was very slow when more than 1 million records have been inserted. (I have more than 20 million records to insert.)
This may be better because it isn't doing the query first. You may discover that you need to aggressively run one of the VACUUM processes (I'd guess regular and ANALYZE but not FULL) in order to keep performance steady as the number of records grows. Anytime you update a row, that becomes a dead row that's still taking up space, and if you do a lot of those they get in the way of finding the rows that are still live. Take a look at http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html to get an idea of the process.
-- * Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org/