Dan Armbrust <daniel.armbrust.list@xxxxxxxxx> writes: > Almost all of the slow query log messages are logged within about 3 > seconds of the checkpoint starting message. > LOG: checkpoint complete: wrote 9975 buffers (77.9%); 0 transaction > log file(s) added, 0 removed, 15 recycled; write=156.576 s, sync=0.065 > s, total=156.662 s Huh. And there's just about no daylight between the total checkpoint time and the write+sync time (barely more than 20msec in both examples). So that seems to wipe out the thought I had that maybe we'd underestimated the work involved in one of the other steps of checkpoint. As Greg commented upthread, we seem to be getting forced to the conclusion that the initial buffer scan in BufferSync() is somehow causing this. There are a couple of things it'd be useful to try here: * see how the size of the hiccup varies with shared_buffers; * try inserting a delay into that scan loop, as per attached quick-and-dirty patch. (Numbers pulled from the air, but we can worry about tuning after we see if this is really where the problem is.) regards, tom lane
*** src/backend/storage/buffer/bufmgr.c~ Tue Jan 1 14:45:51 2008 --- src/backend/storage/buffer/bufmgr.c Wed Jul 8 18:12:49 2009 *************** *** 1037,1042 **** --- 1037,1045 ---- } UnlockBufHdr(bufHdr); + + if ((buf_id % 256) == 0) + pg_usleep(1000L); } if (num_to_write == 0)
-- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general