Sorry, hit "send" too early by accident.

On 28.4.2014 16:07, Tom Lane wrote:
> Elanchezhiyan Elango <elanelango@xxxxxxxxx> writes:
>>> The problem is that while this makes the checkpoints less
>>> frequent, it accumulates more changes that need to be written to
>>> disk during the checkpoint. Which means the impact more severe.
>
>> True. But the checkpoints finish in approximately 5-10 minutes
>> every time (even with checkpoint_completion_target of 0.9).
>
> There's something wrong with that. I wonder whether you need to
> kick checkpoint_segments up some more to keep the checkpoint from
> being run too fast.

Too fast? All the checkpoints listed in the log were "timed", pretty
much exactly in 1h intervals:

Apr 26 00:12:57 LOG: checkpoint starting: time
Apr 26 01:12:57 LOG: checkpoint starting: time
Apr 26 02:12:57 LOG: checkpoint starting: time
Apr 26 03:12:57 LOG: checkpoint starting: time
Apr 26 04:12:58 LOG: checkpoint starting: time
Apr 26 05:12:57 LOG: checkpoint starting: time
Apr 26 06:12:57 LOG: checkpoint starting: time

There's certainly something fishy, because although this is the
supposed configuration:

checkpoint_segments = 250
checkpoint_timeout = 1h
checkpoint_completion_target = 0.9

the checkpoints in the log typically finish in much shorter periods
of time. Like this, for example:

Apr 26 10:12:57 LOG: checkpoint starting: time
Apr 26 10:26:27 LOG: checkpoint complete: wrote 9777 buffers (15.3%); 0 transaction log file(s) added, 0 removed, 153 recycled; write=800.377 s, sync=8.605 s, total=809.834 s; sync files=719, longest=1.034 s, average=0.011 s

And that's one of the longer runs - most of the others run in ~5-6
minutes. Now, maybe I'm mistaken but I'd expect the checkpoints to
finish in ~54 minutes, which is
(checkpoint_completion_target * checkpoint_timeout), i.e. 0.9 * 60 minutes.

> Even so, though, a checkpoint spread over 5-10 minutes ought to
> provide the kernel with enough breathing room to flush things.
> It sounds like the kernel is just sitting on the dirty buffers until it
> gets hit with fsyncs, and then it's dumping them as fast as it can.
> So you need some more work on tuning the kernel parameters.

I'm not sure about this - the /proc/meminfo snapshots sent in the
previous post show that the amount of "Dirty" memory is usually well
below ~20MB, with max at ~36MB at 22:24:26, and within a matter of
seconds it drops down to ~10MB of dirty data.

Also, the kernel settings seem quite aggressive to me:

vm.dirty_background_ratio = 1
vm.dirty_background_bytes = 0
vm.dirty_ratio = 20
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 500

regards
Tomas
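
P.S. To make the arithmetic above concrete, here is a minimal Python
sketch that pulls the timings out of the "checkpoint complete" line
quoted earlier and compares the write phase with the duration I'd
expect. The expected duration is my assumption
(checkpoint_completion_target * checkpoint_timeout), and the helper
name parse_checkpoint_times is made up for illustration, it is not
part of PostgreSQL or any tool.

```python
import re

# Settings quoted in the message (the "supposed configuration").
CHECKPOINT_TIMEOUT_S = 60 * 60        # checkpoint_timeout = 1h
CHECKPOINT_COMPLETION_TARGET = 0.9    # checkpoint_completion_target = 0.9

# The "checkpoint complete" log line quoted in the message.
LOG_LINE = (
    "Apr 26 10:26:27 LOG: checkpoint complete: wrote 9777 buffers (15.3%); "
    "0 transaction log file(s) added, 0 removed, 153 recycled; "
    "write=800.377 s, sync=8.605 s, total=809.834 s; "
    "sync files=719, longest=1.034 s, average=0.011 s"
)


def parse_checkpoint_times(line):
    """Hypothetical helper: return write/sync/total durations in seconds."""
    times = {}
    for key in ("write", "sync", "total"):
        # "sync=" does not match "sync files=", so the right field is picked.
        match = re.search(rf"\b{key}=([0-9.]+) s", line)
        if match is None:
            raise ValueError(f"no {key}= field in log line")
        times[key] = float(match.group(1))
    return times


# Assumption: a timed checkpoint should spread its writes over
# checkpoint_completion_target * checkpoint_timeout.
expected_write_s = CHECKPOINT_COMPLETION_TARGET * CHECKPOINT_TIMEOUT_S

times = parse_checkpoint_times(LOG_LINE)
ratio = times["write"] / expected_write_s

print(f"expected write phase: {expected_write_s / 60:.0f} min")
print(f"actual write phase:   {times['write'] / 60:.1f} min")
print(f"actual / expected:    {ratio:.0%}")
```

For the line above the write phase takes roughly a quarter of the
expected ~54 minutes, and that is one of the longer runs.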