David Chinner wrote:
On Thu, Feb 28, 2008 at 08:09:57AM -0500, Ric Wheeler wrote:
One more thought - what we really want here is to have a sense of the
latency of the device. In the S-ATA disk case, this optimization works
well for batching since we "spend" an extra 4ms worst case in the chance
of combining multiple, slow 18ms operations.
With the clariion box we tested, the optimization fails badly since the
cost is only 1.3 ms so we optimize by waiting 3-4 times longer than it
would take to do the operation immediately.
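
To put rough numbers on the trade-off above, here is a minimal sketch of a toy model, not a measurement. It assumes that n writers all arrive within the wait window, that unbatched writes are serviced one after another at a fixed per-operation cost, and that batching costs one wait plus one operation. The function names are hypothetical; only the 4ms, 18ms and 1.3ms figures come from the discussion.

```python
import math


def wait_overhead_ratio(wait_ms, op_ms):
    """How many device operations fit into the time spent waiting."""
    return wait_ms / op_ms


def min_writers_for_batching_win(wait_ms, op_ms):
    """Smallest writer count n for which one batched write beats n serial writes.

    Toy model (an assumption, not the kernel's logic):
      unbatched cost = n * op_ms
      batched cost   = wait_ms + op_ms
    Batching wins when n * op_ms > wait_ms + op_ms, i.e. n > 1 + wait_ms / op_ms.
    """
    return math.floor(1 + wait_ms / op_ms) + 1


if __name__ == "__main__":
    for name, op_ms in (("S-ATA disk", 18.0), ("Clariion", 1.3)):
        ratio = wait_overhead_ratio(4.0, op_ms)
        writers = min_writers_for_batching_win(4.0, op_ms)
        print(f"{name}: wait is {ratio:.2f}x one op, "
              f"batching pays off from {writers} writers in this model")
```

In this model the 4ms wait is about a fifth of one S-ATA operation but roughly three operations' worth on the array, which is why a fixed wait suits one device and hurts the other. The crossover counts are only as good as the model's assumptions and should not be read as predictions for real hardware.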
This has also seemed to me to be the same problem that IO schedulers
have with plugging - we want to dynamically figure out when to plug and
unplug here without hard-coding device-specific tunings.
If we bypass the snippet for multi-threaded writers, we would probably
slow down this workload on normal S-ATA/ATA drives (or even higher
performance non-RAID disks).
It's the self-tuning aspect of this problem that makes it hard. In
the case of XFS, the way this tuning is done is that we look at the
state of the previous log I/O buffer to check if it is still syncing
to disk. If it is still syncing, we go to sleep waiting for that log
buffer I/O to complete. This holds the current buffer open to
aggregate more transactions before syncing it to disk and hence
allows parallel fsyncs to be issued in the one log write. The fact
that it waits for the previous log I/O to complete means it
self-tunes to the latency of the underlying storage medium.
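
The behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not XFS code: it assumes a single log device with a fixed I/O latency, that each fsync arrives at a known time, and that the next log write is issued as soon as the previous one completes and at least one transaction is waiting. The function name and parameters are hypothetical.

```python
def simulate_log_writes(arrivals_ms, io_latency_ms):
    """Group fsync arrival times into log writes.

    The current buffer stays open while the previous log I/O is in flight,
    so every transaction that arrives before that I/O completes is folded
    into the same write. Returns one list of arrival times per log write.
    """
    pending = sorted(arrivals_ms)
    batches = []
    device_free_at = 0.0  # time at which the previous log I/O completes
    i = 0
    while i < len(pending):
        # The write cannot start before the previous I/O finishes,
        # nor before there is anything to write.
        start = max(device_free_at, pending[i])
        # Everything that has arrived by then joins this buffer.
        j = i
        while j < len(pending) and pending[j] <= start:
            j += 1
        batches.append(pending[i:j])
        device_free_at = start + io_latency_ms
        i = j
    return batches


if __name__ == "__main__":
    arrivals = [float(t) for t in range(12)]  # one fsync per millisecond
    for name, latency in (("slow device, 18 ms", 18.0), ("fast device, 1.3 ms", 1.3)):
        sizes = [len(b) for b in simulate_log_writes(arrivals, latency)]
        print(f"{name}: {len(sizes)} log writes, batch sizes {sizes}")
```

With the same arrival pattern, the slow device ends up with a few large batches and the fast device with many small ones, without any device-specific wait value being configured. That is the self-tuning property: the batching window is simply however long the previous I/O took.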
Cheers,
Dave.
With the experiments we ran before, the heuristic did eventually start
helping when we hit really high numbers of concurrent writing threads on
the Clariion box. I forget how many, but it was at least 12 or so.
ric