Hi Michael, Michael Fuhr wrote: >>> Have you tweaked postgresql.conf at all? If so, what non-default >>> settings are you using? >> >> Yes, I have tweaked the following settings: >> >> shared_buffers = 40000 >> work_mem = 512000 >> maintenance_work_mem = 512000 >> max_fsm_pages = 40000 >> effective_cache_size = 131072 > > Are you sure you need work_mem that high? How did you decide on > that value? I have used http://www.powerpostgresql.com/Downloads/annotated_conf_80.html , expecting that the differences between 8.0 and 8.1 do not invalidate the recommendations. I have checked with (some) of my (large) queries and adjusted upward untill I had no temp files in the PGDATA/base/DB_OID/pgsql_tmp. (The warning about > Are all other settings at their defaults? Yep. > No changes to the write ahead log (WAL) or background writer (bgwriter) settings? No, because the forementioned document explicitely states that it has recomendations on these subjects. > What version of PostgreSQL are you running? The paths in your > original message suggest 8.1.x. Debian's Ecth 8.1.0-3 > A checkpoint updates the database files with the data from the > write-ahead log; you're seeing those writes to the database partition. > The postmaster does checkpoints every checkpoint_timeout seconds > (default 300) or every checkpoint_segment log segments (default 3); > it also uses a background writer to trickle pages to the database > files between checkpoints so the checkpoints don't have as much > work to do. I've been wondering if your pgbench runs are being > affected by that background activity; the fact that you get > consistently good performance after forcing a checkpoint suggests > that that might be the case. OK, thanks. To be sure if I understand it correctly: 1. Every update/insert is first written to a WAL log file which is in the PGDATA/pg_xlog directory. 2. Routinely the background writer than writes the changes to the PGDATA/base/DB_OID/ directory. 2. Postmaster forces after 300 secs or if the log segments are full (which ever comes first?) a checkpoint so that the WAL log file are empty ( I assume that that are the changes the background writer has not written yet since the last forced checkpont?). > If you run pgbench several times without intervening checkpoints, > do your postmaster logs have any messages like "checkpoints are > occurring too frequently"? It might be useful to increase > checkpoint_warning up to the value of checkpoint_timeout and then > see if you get any such messages during pgbench runs. If checkpoints > are happening a lot more often than every checkpoint_timeout seconds > then try increasing checkpoint_segments (assuming you have the disk > space). After doing so, restart the database and run pgbench several > times without intervening checkpoints and see if performance is > more consistent. I will try that this day. > Note that tuning PostgreSQL for pgbench performance might be > irrelevant for your actual needs unless your usage patterns happen > to resemble what pgbench does. The advantage of using pgbench is a repeatable short command that leads to something that is showing in actual real world usage. My problem is with the raw performance of my disk array (3Ware 9500S-8 SATA RAID5 controller with 5 disks). I am having *very* serious performance problems if I do large updates on my databases. E.g. an update of 1 (boolean) column in a table (update prototype.customers set deleted = false) that has 368915 records last forever (> 3500 secs ). The only noticable disk activity during such an update is on the disk/partition that has the PGDATA/base/DB_OID/ directory (/dev/sdc, the 3Ware 9800S-8 RAID 5 array). There is *no* noticable disk activity on the disk/partition that hase the PGDATA/pg_xlog directory (/dev/sdb, on a Sil 3114 on-board SAT controller). The throughtput during the update is ~ 2 MB/sec. The thoughtput during a large file copy or running bonnie (a benchmark) is > 40 MB/sec. My primary goal is to understand the differences ( and than sue the guilty ones ;-)), and than maybe either learn to live with it or find a solution. The number of write operations/sec during the update is ~ 2000 /sec. I suspect that the RAID card cannot handle a lot of small write operations (with fsync?) in a short time without performance penalty (and yes, the write cache on the controller is enabled). Groeten, Joost Kraaijeveld Askesis B.V. Molukkenstraat 14 6524NB Nijmegen tel: 024-3888063 / 06-51855277 fax: 024-3608416 e-mail: J.Kraaijeveld@xxxxxxxxxx web: www.askesis.nl