On Wed, 22 Aug 2007, Dmitry Potapov wrote:
I found this http://www.westnet.com/~gsmith/content/linux-pdflush.htm
If you do end up following up on this via the Linux kernel mailing list, please pass that link along. I've been meaning to submit it to them and wait for the flood of e-mail telling me what I screwed up; that will go better if you tell them about it instead of me.
I temporarily solved this problem by setting dirty_background_ratio to 0%. This causes the dirty data to be written out immediately. It is OK for our setup (mostly because of the large controller cache), but it doesn't look to me like an elegant solution. Is there some other way to fix this issue without disabling the pagecache and the IO smoothing it was designed to perform?
I spent a couple of months trying and decided it was impossible. Your analysis of the issue is completely accurate; lowering dirty_background_ratio to 0 makes the system much less efficient, but it's the only way to make the stalls go completely away.
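For anyone who wants to see what the kernel is currently using before changing anything, here is a minimal Python sketch. It assumes a Linux system exposing the usual files under /proc/sys/vm; the helper names read_vm_tunables and set_vm_tunable are made up for illustration, and writing a value requires root.

```python
from pathlib import Path

# Standard location of the Linux VM writeback tunables.
VM_DIR = Path("/proc/sys/vm")
TUNABLES = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_tunables(vm_dir=VM_DIR, names=TUNABLES):
    """Return {name: int or None} for each tunable found under vm_dir."""
    values = {}
    for name in names:
        try:
            values[name] = int((Path(vm_dir) / name).read_text().strip())
        except (OSError, ValueError):
            # Missing or unreadable on this kernel: report it as unknown.
            values[name] = None
    return values


def set_vm_tunable(name, value, vm_dir=VM_DIR):
    """Write a new value for one tunable (needs root on a real system)."""
    (Path(vm_dir) / name).write_text(f"{value}\n")


if __name__ == "__main__":
    for name, value in read_vm_tunables().items():
        print(f"{name} = {value}")
```

Calling set_vm_tunable("dirty_background_ratio", 0) as root would apply the workaround described above until the next reboot; it is not a persistent setting.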
I contributed some help toward fixing the issue in the upcoming 8.3 instead; there's a new checkpoint writing process aimed at easing the exact problem you're running into. See the new checkpoint_completion_target tunable at http://developer.postgresql.org/pgdocs/postgres/wal-configuration.html
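As a rough illustration of what that tunable does: the documentation describes checkpoint_completion_target as the fraction of the interval between checkpoints over which the checkpoint writes are spread. The Python sketch below just does that arithmetic. The function name is hypothetical, the 300-second interval is an assumed example value, and a real checkpoint may be triggered earlier by WAL volume, so treat the result as an upper-end estimate rather than what the server will actually do.

```python
def checkpoint_write_window(checkpoint_interval_s, completion_target):
    """Seconds over which checkpoint writes would be spread.

    checkpoint_interval_s: assumed time between checkpoint starts, in seconds.
    completion_target: value of checkpoint_completion_target (0.0 to 1.0).
    """
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("completion_target must be between 0.0 and 1.0")
    return checkpoint_interval_s * completion_target


if __name__ == "__main__":
    # Example values only: checkpoints 300 s apart, target of 0.5.
    print(checkpoint_write_window(300, 0.5))  # 150.0
```

The point of spreading the writes is to hand the kernel a steady trickle of dirty pages instead of one burst at checkpoint time, which is the burst that triggers the stalls.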
If you could figure out how to run some tests to see if the problem clears up for you using the new technique, that would be valuable feedback for the development team for the upcoming 8.3 beta. Probably more productive use of your time than going crazy trying to fix the issue in 8.2.4.
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD