Jens Axboe wrote:
On Thu, May 31 2007, David Chinner wrote:
On Thu, May 31, 2007 at 08:26:45AM +0200, Jens Axboe wrote:
On Thu, May 31 2007, David Chinner wrote:
IOWs, there are two parts to the problem:
1 - guaranteeing I/O ordering
2 - guaranteeing blocks are on persistent storage.
Right now, a single barrier I/O is used to provide both of these
guarantees. In most cases, all we really need to provide is 1); the
need for 2) is a much rarer condition but still needs to be
provided.
if I am understanding it correctly, the big win for barriers is that you
do NOT have to stop and wait until the data is on persistant media before
you can continue.
Yes, if we define a barrier to only guarantee 1), then yes this
would be a big win (esp. for XFS). But that requires all filesystems
to handle sync writes differently, and sync_blockdev() needs to
call blkdev_issue_flush() as well....
So, what do we do here? Do we define a barrier I/O to only provide
ordering, or do we define it to also provide persistent storage
writeback? Whatever we decide, it needs to be documented....
The block layer already has a notion of the two types of barriers, with
a very small amount of tweaking we could expose that. There's absolutely
zero reason we can't easily support both types of barriers.
That sounds like a good idea - we can leave the existing
WRITE_BARRIER behaviour unchanged and introduce a new WRITE_ORDERED
behaviour that only guarantees ordering. The filesystem can then
choose which to use where appropriate....
Precisely. The current definition of barriers are what Chris and I came
up with many years ago, when solving the problem for reiserfs
originally. It is by no means the only feasible approach.
I'll add a WRITE_ORDERED command to the #barrier branch, it already
contains the empty-bio barrier support I posted yesterday (well a
slightly modified and cleaned up version).
Wait. Do filesystems expect (depend on) anything but ordering now? Does
md? Having users of barriers as they currently behave suddenly getting
SYNC behavior where they expect ORDERED is likely to have a negative
effect on performance. Or do I misread what is actually guaranteed by
WRITE_BARRIER now, and a flush is currently happening in all cases?
And will this also be available to user space f/s, since I just proposed
a project which uses one? :-(
I think the goal is good, more choice is almost always better choice, I
just want to be sure there won't be big disk performance regressions.
--
bill davidsen <davidsen@xxxxxxx>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html