Tom Lane wrote:
Joe Maldonado <jmaldonado@xxxxxxxxxxxxxxx> writes:most of the time is spent in D state. Also, the rest of the system isn't doing
Can concurrent updates/deletes slow down vacuum when it is progressing ? I
mean to ask if vacuum would have to redo or stall its work because of the
updates/deletes. Is it even possible that it goes into a long loop while
such updates occur ?
vacuum has to take an exclusive lock at the page level on any page it is trying to remove tuples from. The code is such that any ordinary operations on the same page will block the vacuum; vacuum doesn't get the lock until no one else is interested in the page. Given sufficiently heavy concurrent activity on the table, I suppose it could take quite a while for vacuum to finish.
The reason for my question is that I'm seeing vacuuming take several hours
on a big table (~1million rows) that is frequently being updated (1000
updates/min). They are run around 2 times a day and each time it takes ~3
hrs. There are various other processes on the box using the database, but
the I/O load isn't very high. When vacuum runs, it causes various I/O
tasks to run very slowly.
However, if the scenario I described were your problem, the vacuum would be spending most of its time just blocked waiting for page locks; it therefore wouldn't have much effect on I/O. What I suspect is that your machine has no I/O bandwidth to spare and the extra demands of the vacuum are just saturating your disk. You might look at iostat or vmstat output to see what's going on. Also check with ps or top to see if the vacuuming backend spends most of its time in "S" (sleep) or "D" (disk IO) state.
much I/O. iostat shows that without vacuum (on SCSI Raid1), our application
is not using much I/O on an avg (although there is bursty I/O)
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz /dev/sda 0.00 1.60 0.00 1.20 0.00 22.40 0.00 11.20 18.67 1.97
await svctm %util 164.17 42.50 5.10
and with vacuum
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz /dev/sda 0.20 96.20 0.30 90.70 4.00 1468.80 2.00 734.40 16.18 3050.21
await svctm %util 3244.55 10.99 100.00
A await time of ~3seconds is sure to slow down other queries from our application and the system in general. But our tables and nature of our I/O updates are bursty.
I have 2 questions :
1. Does Checkpointing write the pages that have been vacuumed or does vacuum not affect Checkpointing I/O ?
2. Since vacuum in 7. 4 capable of disrupting disk latency (although disk bandwidth is prob only 40% used) so much, given that it is in D state most of the time, is rewriting the application to use temp tables, truncate etc so as to avoid vacuum a good idea ?
Thanks.
If you are using PG 8.0 you could experiment with vacuum_cost_delay and associated parameters. This would slow down vacuum even more in terms of elapsed time to finish, but it should reduce the impact on concurrent processing.
regards, tom lane
OK thanks
---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings