On Tue, Oct 12, 2010 at 04:14:55PM +0200, Christoph Hellwig wrote:
> I still think adding code to every filesystem to optimize for a rather
> stupid use case is not a good idea.  I dropped out a bit from the
> thread in the middle, but what was the real use case for lots of
> concurrent fsyncs on the same inode again?

The use case I'm looking at is concurrent fsyncs on /different/ inodes,
actually.  We have _n_ different processes, each writing (and fsyncing)
its own separate file on the same filesystem; a minimal sketch of that
workload is appended below.  iirc, ext4_sync_file is called with the
inode mutex held, which prevents concurrent fsyncs on the same inode.

> And what is the amount of performance you need?  If we go back to the
> direct submission of REQ_FLUSH requests from the earlier flush+fua
> setups that were faster for high end storage, would that be enough for
> you?
>
> Below is a patch bringing the optimization back.
>
> WARNING: completely untested!

So I hacked up a patch to the block layer that measures the delay
between blk_start_request and blk_finish_request for each flush command
(also sketched below), and I noticed a rather large discrepancy between
the delay as observed by the block layer and the delay as observed by
ext4: in general, ext4 sees nearly twice the delay that the block layer
sees, so I'll give Christoph's direct-flush patch (below) a try over
the weekend.

--D
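
For reference, here's a minimal sketch of the workload I'm describing.
The mount point, worker count, and write size are arbitrary
placeholders, not the parameters of the actual test:

/*
 * n processes, each writing and fsyncing its own file on the same
 * filesystem.  /mnt/test, NPROCS, and NWRITES are made-up values.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROCS  8       /* n concurrent writers (arbitrary) */
#define NWRITES 1024    /* write+fsync iterations per worker */

static void worker(int id)
{
        char path[64], buf[4096];
        int fd, i;

        snprintf(path, sizeof(path), "/mnt/test/file.%d", id);
        memset(buf, 'x', sizeof(buf));

        fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) {
                perror("open");
                exit(1);
        }
        for (i = 0; i < NWRITES; i++) {
                if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
                        perror("write");
                        exit(1);
                }
                /* each worker fsyncs its own inode */
                if (fsync(fd)) {
                        perror("fsync");
                        exit(1);
                }
        }
        close(fd);
        exit(0);
}

int main(void)
{
        int i;

        for (i = 0; i < NPROCS; i++)
                if (fork() == 0)
                        worker(i);
        for (i = 0; i < NPROCS; i++)
                wait(NULL);
        return 0;
}

Since every child hammers a different inode, the inode mutex never
serializes two of them against each other; any contention is further
down the stack.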
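
And a rough sketch of where the block-layer timing hack hooks in.
This is not the actual patch: flush_stamp is a hypothetical field that
would have to be added to struct request, with the two helpers called
from blk_start_request and blk_finish_request respectively.

#include <linux/blkdev.h>
#include <linux/ktime.h>

/*
 * Stamp a flush request when it starts, print the delta when it
 * finishes.  'flush_stamp' does not exist upstream; it stands in for
 * wherever the timestamp actually gets stashed.
 */
static inline void flush_timing_start(struct request *req)
{
        if (req->cmd_flags & REQ_FLUSH)
                req->flush_stamp = ktime_get();
}

static inline void flush_timing_finish(struct request *req)
{
        if (req->cmd_flags & REQ_FLUSH)
                trace_printk("flush latency: %lld ns\n",
                             ktime_to_ns(ktime_sub(ktime_get(),
                                                   req->flush_stamp)));
}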