On Tue, 2009-02-03 at 12:24 +1100, Nick Piggin wrote:
> On Friday 30 January 2009 12:23:15 Jan Kara wrote:
> >   Hi,
> >
> >   today I found that commit 31a12666d8f0c22235297e1c1575f82061480029 (mm:
> > write_cache_pages cyclic fix) slows down operations over Berkeley DB.
> > Without this "fix", I can add 100k entries in about 5 minutes 30s; with
> > that change it takes about 20 minutes. What is IMO happening is that
> > previously we scanned to the end of the file, left writeback_index at the
> > end of the file, and went on to write the next file. With the fix, we wrap
> > around (seek) and, after writing some more, go to the next file (seek
> > again).

We also found that this commit causes about a 40~50% regression with the
iozone mmap-rand-write workload:

#iozone -B -r 4k -s 64k -s 512m -s 1200m

My machine has 8GB of memory.

> Hmm, but isn't that what pdflush has asked for? It is wanting to flush
> some of the dirty data out of this file, and hence it wants to start
> from where it last flushed out and then cycle back and flush more?
>
> >   Anyway, I think the original semantics of "cyclic" make more sense; just
> > the name was chosen poorly. What we should do is really scan to the end of
> > the file, reset the index to start from the beginning next time, and go on
> > to the next file.
>
> Well, if we think of a file as containing a set of dirty pages (as it
> appears to the high-level mm), rather than a sequence, then the behaviour
> of my patch is correct (i.e. there should be no distinction between dirty
> pages at different offsets in the file).
>
> However, clearly there is some problem with that assumption if you're
> seeing a 4x slowdown :P I'd really like to know how it messes up the IO
> patterns. How many files are in the BDB workload? Are filesystem blocks
> being allocated at the end of the file while writeout is happening?
> Delayed allocation?
>
> >   I can write a patch to introduce these semantics, but I'd like to hear
> > the opinions of other people before I do so.
>
> I like dirty page cleaning to be offset-agnostic as far as possible,
> but I can't argue with numbers like that. Though maybe it would be
> possible to solve it some other way.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html