On Thu, Feb 28, 2008 at 08:09:57AM -0500, Ric Wheeler wrote:
> One more thought - what we really want here is to have a sense of the
> latency of the device. In the S-ATA disk case, this optimization works
> well for batching, since we "spend" an extra 4ms worst case for the
> chance of combining multiple, slow 18ms operations.
>
> With the CLARiiON box we tested, the optimization fails badly, since
> the cost is only 1.3ms, so we optimize by waiting 3-4 times longer
> than it would take to do the operation immediately.
>
> This has also seemed to me to be the same problem that I/O schedulers
> face with plugging - we want to dynamically figure out when to plug
> and unplug here without hard-coding in device-specific tunings.
>
> If we bypass the snippet for multi-threaded writers, we would probably
> slow down this workload on normal S-ATA/ATA drives (or even
> higher-performance non-RAID disks).

It's the self-tuning aspect of this problem that makes it hard.

In XFS, the tuning works by looking at the state of the previous log
I/O buffer to check whether it is still syncing to disk. If it is, we
go to sleep waiting for that log buffer I/O to complete. This holds the
current buffer open to aggregate more transactions before syncing it to
disk, and hence allows parallel fsyncs to be issued in a single log
write. Because we wait for the previous log I/O to complete, the
mechanism self-tunes to the latency of the underlying storage medium.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
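
For illustration, a minimal sketch of the self-tuning batching Dave
describes might look like the following. This is not the actual XFS
code; the names (log_buffer, submit_io, flush_log_buffer) and the
pthread-based synchronisation are assumptions made for the example.

	/*
	 * Sketch of latency self-tuned log batching: hold the current
	 * log buffer open until the previous buffer's I/O completes.
	 */
	#include <pthread.h>
	#include <stdbool.h>
	#include <unistd.h>

	struct log_buffer {
		pthread_mutex_t lock;
		pthread_cond_t  io_done;
		bool            syncing;  /* I/O currently in flight? */
		int             batched;  /* transactions aggregated */
	};

	/* Stand-in for the device write; blocks for the sync latency. */
	void submit_io(struct log_buffer *buf)
	{
		(void)buf;
		usleep(1300);	/* e.g. ~1.3ms, as on the array Ric measured */
	}

	/*
	 * Sync the current buffer, but only after the previous buffer's
	 * I/O has completed.  While we sleep in the wait loop below,
	 * concurrent fsync callers keep adding transactions to 'cur',
	 * so the batch grows for exactly as long as one device write
	 * takes to finish.
	 */
	void flush_log_buffer(struct log_buffer *cur, struct log_buffer *prev)
	{
		pthread_mutex_lock(&prev->lock);
		while (prev->syncing)
			pthread_cond_wait(&prev->io_done, &prev->lock);
		pthread_mutex_unlock(&prev->lock);

		pthread_mutex_lock(&cur->lock);
		cur->syncing = true;
		pthread_mutex_unlock(&cur->lock);

		submit_io(cur);		/* one write covers the whole batch */

		pthread_mutex_lock(&cur->lock);
		cur->syncing = false;
		cur->batched = 0;
		pthread_cond_broadcast(&cur->io_done);
		pthread_mutex_unlock(&cur->lock);
	}

The point of the sketch is that the wait on prev->io_done is the only
timing mechanism: there is no hard-coded timeout, so the batching
window shrinks automatically on a low-latency array and grows on a slow
S-ATA disk - exactly the device-specific behaviour the thread is
discussing.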