I have a Java application that synchronizes tables in two databases (remote source to local target). It does so by removing all constraints, comparing table contents row by row, inserting missing rows, and deleting "extra" rows in the target database. Delete performance is incredibly bad: a batch of 100 deletes takes about 16 to 20 seconds(!). Insert and update performance is fine.
The Java code that handles the delete uses a prepared statement:
"delete from xxx where xxx_pk=?"
The deletes are queued with addBatch() and flushed with executeBatch() (the latter every 100 deletes), then committed. Not using executeBatch() makes no difference.
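Stripped down, the delete path looks roughly like this (a simplified sketch: connection setup, error handling and the row-comparison logic are omitted, and the cfs_file/cfsid names are just taken from the example table further down):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class DeleteSync {
    private static final int BATCH_SIZE = 100;

    // Delete the given primary keys from the target table in batches of 100.
    static void deleteRows(Connection conn, List<Long> pksToDelete) throws Exception {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                "delete from cfs_file where cfsid = ?")) {
            int count = 0;
            for (Long pk : pksToDelete) {
                ps.setLong(1, pk);
                ps.addBatch();
                if (++count % BATCH_SIZE == 0) {
                    ps.executeBatch();   // flush every 100 deletes
                }
            }
            ps.executeBatch();           // flush the remainder
            conn.commit();
        }
    }
}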
An example table where deletes are slow:
pzlnew=# \d cfs_file
              Table "public.cfs_file"
      Column      |            Type             | Modifiers
------------------+-----------------------------+-----------
 cfsid            | bigint                      | not null
 cfs_date_created | timestamp without time zone | not null
 cfs_name         | character varying(512)      | not null
 cfs_cfaid        | bigint                      |
 cfs_cfdid        | bigint                      |
Indexes:
    "cfs_file_pkey" PRIMARY KEY, btree (cfsid)
The table has no FK constraints at all and holds 940,204 rows.
While deleting, Postgres uses 100% CPU the entire time.
Inserts and updates are handled in exactly the same way, and these are a few orders of magnitude faster than the deletes.
I am running the DB on a 64-bit Ubuntu 12.04 machine with Postgres 9.1. It is a fast machine: the database sits on an SSD (ext4), with 16 GB of RAM and an i7-3770 CPU @ 3.40GHz.
Does anyone have any idea?
Thanks in advance,
Frits