On Wed, Jun 29, 2011 at 10:26:33AM -0700, Curt Wohlgemuth wrote: > The semantics did change between 2.6.34 and 2.6.35 though. When the > work item queue was introduced in 2.6.32, the semantics changed from > what you describe above to what's present through 2.6.34: > writeback_inodes_sb() would enqueue a work item, and return. Your > commit 83ba7b07 ("writeback: simplify the write back thread queue") > added the wait_for_completion() call, putting the semantics back to > where they were pre-2.6.32. Yes. The kernels inbetween had that nasty writeback vs umount races that we could trigger quite often. > Though one additional change between the old way (pre-2.6.32) and > today: with the old kernel, the pdflush thread would operate > concurrently with the first (and second?) sync path through writeback. > Today of course, they're *all* serialized. So really a call to > sys_sync() will enqueue 3 work items -- one from > wakeup_flusher_threads(), one from writeback_inodes_sb(), and one from > sync_inodes_sb(). Yes. And having WB_SYNC_NONE items from both wakeup_flusher_threads vs writeback_inodes_sb is plain stupid. The initial conversion to the new sync_filesystem scheme had removed the wakeup_flusher_threads call, but that caused a huge regression in some benchmark. As mentioned before Wu was working on some code to introduce tagging so that the WB_SYNC_ALL call won't start writing pages dirtied after the sync call, which should help with your issue. Although to completely solve it we really need to get down to just two passes. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html