Josh Berkus wrote:
> Given your analysis of fsync'ing behavior on Ext3, would you say that it is better to set checkpoint_completion_target to 0.0 on Ext3?
Setting that to 0.0 gives the same basic behavior as in 8.2 and earlier versions, which had even worse I/O spike issues. Even on ext3, there is value in spreading the writes out over time, particularly if you have a large setting for checkpoint_segments. Ideally the write phase will be spread out over 2.5 minutes (half of the default 5 minute checkpoint_timeout), if you've set the segments high enough that checkpoints are being driven by checkpoint_timeout. The original testing Heikki and I did settled on the default of 0.5 for checkpoint_completion_target on ext3, so that part hasn't really changed. It's still better than just writing everything in one big dump, as you'd see with it set to 0.0.
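To make the 2.5 minute figure concrete, here is a rough sketch of the arithmetic for timeout-driven checkpoints. The function name write_phase_seconds is made up for illustration, and the model is deliberately simplified: it assumes checkpoints are triggered by checkpoint_timeout rather than by running out of checkpoint_segments, and it says nothing about how long the sync phase takes afterwards.

```python
def write_phase_seconds(checkpoint_timeout_s: float, completion_target: float) -> float:
    """Approximate time the checkpoint write phase is spread over.

    Simplified model (an assumption, not PostgreSQL's actual scheduler):
    for a checkpoint driven by checkpoint_timeout, the writes are paced to
    finish after roughly timeout * checkpoint_completion_target.
    """
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("checkpoint_completion_target must be between 0.0 and 1.0")
    return checkpoint_timeout_s * completion_target


if __name__ == "__main__":
    # 300 seconds is the default checkpoint_timeout of 5 minutes.
    for target in (0.0, 0.5):
        secs = write_phase_seconds(300, target)
        print(f"checkpoint_completion_target={target}: ~{secs / 60:.1f} min write phase")
```

With the target at 0.0 the write phase collapses to nothing, which is the "one big dump" case; at 0.5 it stretches to about half the timeout.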
While Linux and ext3 aren't great about getting stuff to disk, doing some writing in advance of sync will improve things at least a little. The thing that I don't ever expect to work on ext3 is spreading the sync phase out over time.
P.S. Those of you who are into filesystem trivia but don't normally read pgsql-hackers may enjoy http://blog.2ndquadrant.com/en/2011/01/tuning-linux-for-low-postgresq.html and http://archives.postgresql.org/message-id/4D4C4610.1030109@xxxxxxxxxxxxxxx which have the research Josh is alluding to here. I also just wrote a rebuttal today to the "PostgreSQL doesn't have hints" meme at http://blog.2ndquadrant.com/en/2011/02/hinting-at-postgresql.html
-- Greg Smith 2ndQuadrant US greg@xxxxxxxxxxxxxxx Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books