Hello,
I experience significant performance issues with postgresql and updates.
I have a table which contains ~1M rows.
Layout:
TOTO=# \d versions_9d;
Table «public.versions_9d»
Colonne | Type | Modificateurs
------------+------------------------+---------------
hash | character(32) |
date | integer | default 0
diff | integer | default 0
flag | integer | default 0
size | bigint | default 0
zip_size | bigint | default 0
jds | integer | default 0
scanned | integer | default 0
dead | integer | default 0
Test case:
Create a new DB and load a dump of the above database with 976009 rows,
then i perform updates on the whole table. I recorded the time taken
for each full update and the amount of extra disk space used. Each
consecutive update of the table is slower than the previous
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=2"
UPDATE 976009
real 0m41.542s
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=3"
UPDATE 976009
real 0m45.140s (+480M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=4"
UPDATE 976009
real 1m10.554s (+240M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=5"
UPDATE 976009
real 1m24.065s (+127M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=6"
UPDATE 976009
real 1m17.758s (+288M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=7"
UPDATE 976009
real 1m26.777s (+288M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=8"
UPDATE 976009
real 1m39.151s (+289M)
Then i tried adding an index to the table on the column date (int) that
stores unix timestamps.
TOTO=# CREATE INDEX versions_index ON versions_9d (date);
(-60M) disk space goes down on index creation
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=9"
UPDATE 976009
real 3m8.219s (+328M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=8"
UPDATE 976009
real 6m24.716s (+326M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=10"
UPDATE 976009
real 8m25.274s (+321M)
As a sanity check, i loaded mysql5 and tried the same database and
updates. With mysql, the update always lasts ~8s.
The conclusions I have come to is that update==insert+delete which seems
very heavy when index are present (and heavy disk wise on big tables).
Is there a switch i can flip to optimise this?
Thanks in advance,
Gabriel Biberian