Alex - wrote:
These parameters are not so interesting on their own. The important thing to check is how often checkpoints are happening, and how much work each of them do. Here's a useful section from your logs to comment on:
Here you're getting a checkpoint every minute or three, and each of them is writing out ~80000 buffers = 625MB. That is crazy, and no wonder your queries are slow--the system is spending all of its time doing constant, extremely expensive checkpoints. You should re-tune this system until the checkpoints show up no more often than every 5 minutes, and preferably closer to 10. In your case, that means greatly increasing checkpoint_segments.. And you might as well decrease checkpoint_timeout, because right now the timeout is rarely ever being reached before the system runs out of working segments and executes a checkpoint for that reason. I would suggest changing the parameters to these: checkpoint_segments = 96 # in logfile segments, min
1, 16MB each
checkpoint_timeout = 10min # range 30s-1h
And seeing how long the interval between checkpoints becomes under load afterwards. That should make it in the just >5 minute range. I'd expect that the number of buffers will only increase a bit, so instead of dumping out 625MB every minute or three you should see maybe 800MB every 5 minutes--big improvement. Also: just after making the change, save a copy of: select now(),* from pg_stat_bgwriter And then run that same query again a few hours after the change (and maybe the next day twoo). Subtract the two values to see how much they changed, and then you'll find some useful info to compute the checkpoint interval without even looking at the logs. There's a lot of theory and comments about this area at http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm
See that last one? That's a checkpoint that's being driven by time expiring (30 minutes) rather than running out of segments. That just suggests that the system was mostly idle during that period. Long write times are perfectly fine here, the checkpoints are spread out over a long time if possible in order to reduce average I/O. You're just not seeing that the rest of the time because checkpoints are happening so often. -- Greg Smith 2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support greg@xxxxxxxxxxxxxxx www.2ndQuadrant.com |