I still think adding code to every filesystem to optimize for a rather
stupid use case is not a good idea.

I dropped out of the thread in the middle, but what was the real use
case for lots of concurrent fsyncs on the same inode again?  And what
is the amount of performance you need?  If we go back to the direct
submission of REQ_FLUSH requests from the earlier flush+fua patches,
which were faster on high-end storage, would that be enough for you?

Below is a patch bringing the optimization back.  WARNING: completely
untested!

Index: linux-2.6/block/blk-flush.c
===================================================================
--- linux-2.6.orig/block/blk-flush.c	2010-10-12 10:08:43.777004514 -0400
+++ linux-2.6/block/blk-flush.c	2010-10-12 10:10:37.547016093 -0400
@@ -143,6 +143,17 @@ struct request *blk_do_flush(struct requ
 	unsigned skip = 0;
 
 	/*
+	 * Just issue pure flushes directly.
+	 */
+	if (!blk_rq_sectors(rq)) {
+		if (!do_preflush) {
+			__blk_end_request_all(rq, 0);
+			return NULL;
+		}
+		return rq;
+	}
+
+	/*
 	 * Special case.  If there's data but flush is not necessary,
 	 * the request can be issued directly.
 	 *
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html