On Mon, Apr 28, 2014 at 1:41 PM, Tomas Vondra <tv@xxxxxxxx> wrote:
On 28.4.2014 16:07, Tom Lane wrote:
> Elanchezhiyan Elango <elanelango@xxxxxxxxx> writes:
>>> The problem is that while this makes the checkpoints less
>>> frequent, it accumulates more changes that need to be written to
>>> disk during the checkpoint. Which means the impact is more severe.
>
>> True. But the checkpoints finish in approximately 5-10 minutes
>> every time (even with checkpoint_completion_target of 0.9).
>
> There's something wrong with that. I wonder whether you need to
> kick checkpoint_segments up some more to keep the checkpoint from
> being run too fast.
>
> Even so, though, a checkpoint spread over 5-10 minutes ought to
> provide the kernel with enough breathing room to flush things. It
> sounds like the kernel is just sitting on the dirty buffers until it
> gets hit with fsyncs, and then it's dumping them as fast as it can.
> So you need some more work on tuning the kernel parameters.

There's certainly something fishy, because although this is the supposed
configuration:

checkpoint_segments = 250
checkpoint_timeout = 1h
checkpoint_completion_target = 0.9

the checkpoint logs typically finish in much shorter periods of time.

That doesn't look fishy to me. The checkpointer will not take more than one nap between buffers, so it will always write at least 10 buffers per second (of napping time) even if that means it finishes early. Which seems to be the case here--the length of the write cycle, in seconds, seems to be about one tenth the number of buffers written.
Even if that were not the case, the checkpointer also doesn't count buffers written by the backends or the background writer as having been written, so that is another reason for it to finish early. Perhaps the fsync queue should pass on information about whether the written buffers were marked for the checkpointer. There is no reason to think this would improve performance, but it might reduce confusion.
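
To make the arithmetic concrete, here is a rough Python sketch of the bound described above. It is not the server's actual code: the function names are made up for illustration, and the floor of 10 buffers per second is simply taken from the figure quoted above. The buffer counts in the usage note are hypothetical, not taken from the poster's logs.

```python
# Rough model of the "at most one nap per buffer" rule described above.
# MIN_BUFFERS_PER_SECOND is an assumption taken from the "at least 10
# buffers per second" figure in this message, not read from the server.
MIN_BUFFERS_PER_SECOND = 10


def write_phase_seconds(buffers_to_write, checkpoint_timeout_s, completion_target):
    """Estimate the write-phase duration of a timed checkpoint, in seconds.

    The checkpointer aims to spread writes over timeout * target, but it
    cannot go slower than MIN_BUFFERS_PER_SECOND, so small checkpoints
    finish early.
    """
    target_s = checkpoint_timeout_s * completion_target
    slowest_possible_s = buffers_to_write / MIN_BUFFERS_PER_SECOND
    return min(target_s, slowest_possible_s)


def buffers_needed_to_fill_target(checkpoint_timeout_s, completion_target):
    """Estimate the buffer count below which the checkpoint ends before its target."""
    return checkpoint_timeout_s * completion_target * MIN_BUFFERS_PER_SECOND


if __name__ == "__main__":
    # checkpoint_timeout = 1h, checkpoint_completion_target = 0.9
    print(write_phase_seconds(3000, 3600, 0.9))
    print(buffers_needed_to_fill_target(3600, 0.9))
```

Under these assumptions, a checkpoint that has only a few thousand buffers to write ends within minutes regardless of the one-hour timeout, and it would take on the order of tens of thousands of dirty buffers before the write phase could stretch across the full target.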
Cheers,
Jeff