On Wed, Jul 11, 2012 at 9:24 AM, Yan Chunlu <springrider@xxxxxxxxx> wrote:
> I have logged one day of data and found that checkpoints are rather
> frequent (detail: https://gist.github.com/3088338). Not sure if it is
> normal, but the average checkpoint time is about 100 sec~200 sec, which
> seems related to my settings:
>
> 574 checkpoint_segments = 64
> 575 wal_keep_segments = 5000
>
> I set checkpoint_segments to a very large value because otherwise the
> slave server can never keep up with the master. Should I lower that
> value?
>
> Or are the slow queries caused by something else? Thanks!

Some things to notice from the checkpoint log:

* All checkpoints are triggered by checkpoint_timeout, using up only a
  couple of log files
* Checkpoints write out around 40MB of buffers
* The write-out period is spread out nicely, as it's supposed to be, but
  the sync phase is occasionally taking a very long time (more than 2
  minutes)

This looks like something (not necessarily the checkpoint sync itself) is
overloading the I/O system. You might want to monitor the I/O load with
iostat and correlate it with the checkpoints and slow queries to find the
culprit. It's also possible that something else is causing the issues.

If the cause is checkpoints, just making them less frequent might make the
problem worse. I'm assuming you have 16GB+ of RAM, because you have 4GB of
shared_buffers. Just making checkpoint_timeout longer will accumulate a
larger number of dirty buffers that will clog up the I/O queues even worse.
If you are on Linux, lowering dirty_expire_centisecs or
dirty_background_bytes might help spread the load out, but will make
overall throughput worse.

On the other hand, if the I/O overload is from queries (more likely,
because some checkpoints sync quickly), there are no easy tuning answers.
Making queries less I/O intensive is probably the best you can do.
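To line the two up, one approach (a sketch, not a recipe: the log path is a placeholder, the example log line and sysctl values are illustrative) is to sample iostat alongside the server log with log_checkpoints = on, then pull out the sync-phase durations and match them against the I/O spikes:

```shell
# Run in parallel with the workload (iostat is in the sysstat package):
#   iostat -x -t 5 > /tmp/iostat.log
#
# With log_checkpoints = on, the server logs lines like:
#   LOG:  checkpoint complete: wrote 5000 buffers (1.5%); ...
#         write=269.5 s, sync=141.8 s, total=415.0 s
# Pull out the sync-phase durations to match against the iostat samples:
PGLOG=${PGLOG:-/var/log/postgresql/postgresql.log}   # placeholder path
grep -o 'sync=[0-9.]* s' "$PGLOG"

# If checkpoint writeback turns out to be the problem, these can spread
# it out, at some cost in overall throughput (values illustrative):
#   sysctl -w vm.dirty_expire_centisecs=500        # default 3000 (30 s)
#   sysctl -w vm.dirty_background_bytes=67108864   # 64 MB
```

If the iostat samples show the disks saturated while sync times are short, the load is coming from queries rather than checkpoints.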
From the tuning side, newer Linux kernels handle I/O fairness a lot
better, and you could also try tweaking the I/O scheduler to achieve
better throughput and avoid congestion, or at least provide better
latency for trivial queries. And of course it's always possible to throw
more hardware at the problem and upgrade the I/O subsystem.

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
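P.S. A sketch of inspecting and switching the I/O scheduler on Linux (the device name sda is an assumption; at the time cfq was the usual default, and deadline often gives better latency for database workloads):

```shell
DEV=${DEV:-sda}                              # assumed device name
SCHED_FILE=/sys/block/$DEV/queue/scheduler
# The file lists the available schedulers with the active one in
# brackets, e.g. "noop deadline [cfq]"; print the active one:
sed 's/.*\[\(.*\)\].*/\1/' "$SCHED_FILE"
# To switch (as root; lost on reboot unless set via boot parameters):
#   echo deadline > "$SCHED_FILE"
```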