On Tue, 1 Apr 2008, mark wrote:
current settings all default#checkpoint_timeout = 5min
#checkpoint_segments = 3#checkpoint_warning = 30s
#checkpoint_completion_target = 0.5
42 | 1051 | 8035125 | 21954 |
this is what I have on pg_stat_bgwriter ; how much should I increase
checkpoint_segment & checkpoint_completion_target to? thanks a lot!
postgres=# select * from pg_stat_bgwriter;
checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean |42 | 241519 | 10394696
maxwritten_clean | buffers_backend | buffers_alloc
That means that 96% of the time, you are hitting a checkpoint because of activity in less than 5 minutes after the previous one. And 97% of the dirty buffers written out are being done by the checkpoint process. Your system is spending its entire life doing checkpoint work and it's no wonder response time is slow quite regularly.
this really clear! Thannks!!
Increasing checkpoint_segments uses up more disk space for the WAL files and increases the amount of time it will take to recover from a crash. If neither of those two things matter to you, you can easily bump that parameter up to 100 or more. I'd suggest starting with an order of magnitude change and increase from 3 to 30. You have to restart the server to make that change stick. It's impossible to predict how much that will change your situation, to know if that's enough or you really need to increase it further.
Based on what Tomasz suggested a day ago, I had changed settings to
checkpoint_segments = 16
checkpoint_timeout = 20min
checkpoint_completion_target = 0.8
and this seems to have improved performance significantly... data from almost 24 hours
postgres=# select * from pg_stat_bgwriter;
checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
21 | 46 | 2218439 | 34158 | 145 | 60447 | 2707259
but i still do get statements that take over 2 or 3 seconds to execute sometimes... should I increase checkpoint_segments to 30 now? Should I change checkpoint_timeout & checkpoint_completion_target too??
i am fine with both WAL disk usage increasing and increase in time to recover from crash...