I have a production PostgreSQL instance (8.1 on Linux 2.6.15) that
seems to be writing data to disk at rates that I think are
disproportional to the update load imposed on the database. I am
looking for ways to determine the cause of this I/O.
As an example, here is a typical graph produced by Munin:
http://purefiction.net/paste/pg-iostat.png
Running an hourly iostat produces this output:
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 43.50 0.21 0.04 427915 72736
sda 43.62 0.21 0.04 428183 88904
sda 43.74 0.21 0.05 428440 104877
sda 43.90 0.21 0.06 428808 124681
sda 44.06 0.21 0.07 429111 145447
sda 44.27 0.21 0.08 429532 170317
sda 44.46 0.21 0.09 429985 193594
In other words, it's reading about 400MB/hour and writing around
15-20GB/hour, or exactly 118GB during the last six hours. To
determine how well this correlates to the actual inserts and updates
being performed on the database, I ran a loop alongside iostat that
executed "select sum(n_tup_upd), sum(n_tup_ins) from
pg_stat_all_tables" against PostgreSQL every hour and output the
difference. Here are a few samples:
| delta_upd | delta_ins |
+-----------+-----------+
| 7111 | 2343 |
| 7956 | 2302 |
| 7876 | 2181 |
| 9269 | 2477 |
| 8553 | 2205 |
For the write numbers to match the tuple numbers, each updated/
inserted tuple would have to average at least 1.5MB (15 GB divided by
10,000 tuples), which is not the case; the total size of the raw
tuples updated/inserted during the above session probably does not
exceed a couple of megabytes. Even considering overhead, page size,
MVCC, etc., this does not compute.
I have not narrowed this explicitly down to PostgreSQL, but since the
stock Linux kernel we are running does not offer per-process I/O
statistics, I cannot determine this for sure. However, except for the
PostgreSQL database, everything else on the box should be identical
to what we are running on other boxes, which are not exhibiting the
same kind of load. Oh, and I have PostgreSQL logging turned off.
Note that PostgreSQL's performance in itself seems fine, and
according to top/ps it's only very rarely in iowait.
Alexander.