Hello Greg,

Thanks for your extensive reply.

2010/1/9 Greg Smith <greg@xxxxxxxxxxxxxxx>:
> Anton Belyaev wrote:
>>
>> I think all the IOwait comes during sync time, which is 80 s,
>> according to the log entry.
>>
>
> I believe you are correctly diagnosing the issue. The "sync time" entry in
> the log was added there specifically to make it easier to confirm this
> problem you're having exists on a given system.
>
>> bgwriter_lru_maxpages = 0 # BG writer is off
>> checkpoint_segments = 45
>> checkpoint_timeout = 60min
>> checkpoint_completion_target = 0.9
>>
>
> These are reasonable settings. You can look at pg_stat_bgwriter to get more
> statistics about your checkpoints; grab a snapshot of that now, another one
> later, and then compute the difference between the two. I've got an example
> of that http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm
>
> You should be aiming to have a checkpoint no more than every 5 minutes, and
> on a write-heavy system shooting for closer to every 10 is probably more
> appropriate. Do you know how often they're happening on yours? Two
> pg_stat_bgwriter snapshots from a couple of hours apart, with a timestamp on
> each, can be used to figure that out.
>

Checkpoints happen about once an hour, sometimes a bit more often
(every 30 minutes) during daily peaks.

>> I had mostly the same config with my 8.3 deployment.
>> But hardware is different:
>> Disk is software RAID-5 with 3 hard drives.
>> Operating system is Ubuntu 9.10 Server x64.
>>
>
> Does the new server have a lot more RAM than the 8.3 one? Some of the
> problems in this area get worse the more RAM you've got.
>

Yes, the new server has 12 GB while the old one has only 8 GB.

> Does the new server use ext4 while the old one used ext3?
>

Both use the same ext3 filesystem.

> Basically, you have a couple of standard issues here:
>
> 1) You're using RAID-5, which is not known for good write performance. Are
> you sure the disk array performs well on writes?
> And if you didn't benchmark it, you can't be sure.
>

I did some dd benchmarks (following
http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm):

Old server with its "hardware RAID-1" shows 60 MB/s on write.
New server with software RAID-5 shows 85 MB/s on write.

> 2) Linux is buffering a lot of writes that are only making it to disk at
> checkpoint time. This could be simply because of (1)--maybe the disk is
> always overloaded. But it's possible this is just due to excessive Linux
> buffering being lazy about the writes. I wrote something about that topic
> at http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html you
> might find interesting.
>

Old server has
dirty_ratio = 10
dirty_background_ratio = 5

New server had
dirty_ratio = 20
dirty_background_ratio = 10

Putting together all the tests and measurements above:

The new server has more RAM, leaving Linux more room for write cache.
During the dd test, dirty pages in /proc/meminfo went up to 2 GB.
RAID-5 is a bit faster (at least on sequential writes).
The drives aren't overloaded, because their utilization during the
lengthy checkpoint is low.
IOwait problems occur only during the final sync part of the checkpoint,
and during this short period the drives are almost 100% utilized
(according to sar -d 1).

I experimented a bit, setting dirty_background_ratio = 1, but this
somehow had a negative effect, which is strange. I had hoped it would
spread the load from the 2-minute sync period across the 1-hour
checkpoint span, but it did not.

As a result, I still don't know where the real problem is.
The drives aren't overloaded.
The Linux cache is really mysterious, but modifying its parameters does
not give the desired effect.

Thanks.
Anton.
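
For reference, here is a rough Python sketch of what the dirty_ratio and
dirty_background_ratio settings quoted above would mean in gigabytes on
each server. It assumes the thresholds are a plain percentage of total
RAM. That is a simplification: the kernel computes them against available
(free plus reclaimable) memory, so the real figures would be somewhat
lower. The function name dirty_thresholds_gb is made up for this
illustration.

```python
def dirty_thresholds_gb(ram_gb, dirty_background_ratio, dirty_ratio):
    """Return (background, hard) dirty-page thresholds in GB.

    Assumption: the ratios are applied to total RAM. The kernel really
    applies them to available memory, so treat these as upper bounds.
    """
    background = ram_gb * dirty_background_ratio / 100  # background flushing starts here
    hard = ram_gb * dirty_ratio / 100                   # writing processes get throttled here
    return background, hard


# (RAM in GB, dirty_background_ratio, dirty_ratio) as reported in this thread
servers = {
    "old server": (8, 5, 10),
    "new server": (12, 10, 20),
}

for name, (ram, bg_ratio, ratio) in servers.items():
    bg, hard = dirty_thresholds_gb(ram, bg_ratio, ratio)
    print(f"{name}: background writeback at ~{bg:.1f} GB, "
          f"writers throttled at ~{hard:.1f} GB")
```

Under that assumption the new server could accumulate about three times
as much dirty data as the old one before background writeback starts,
which would be consistent with the roughly 2 GB of dirty pages seen
during the dd test.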