On Thu, May 31 2007, Bill Davidsen wrote: > Jens Axboe wrote: > >On Thu, May 31 2007, David Chinner wrote: > > > >>On Thu, May 31, 2007 at 08:26:45AM +0200, Jens Axboe wrote: > >> > >>>On Thu, May 31 2007, David Chinner wrote: > >>> > >>>>IOWs, there are two parts to the problem: > >>>> > >>>> 1 - guaranteeing I/O ordering > >>>> 2 - guaranteeing blocks are on persistent storage. > >>>> > >>>>Right now, a single barrier I/O is used to provide both of these > >>>>guarantees. In most cases, all we really need to provide is 1); the > >>>>need for 2) is a much rarer condition but still needs to be > >>>>provided. > >>>> > >>>> > >>>>>if I am understanding it correctly, the big win for barriers is that > >>>>>you do NOT have to stop and wait until the data is on persistant media > >>>>>before you can continue. > >>>>> > >>>>Yes, if we define a barrier to only guarantee 1), then yes this > >>>>would be a big win (esp. for XFS). But that requires all filesystems > >>>>to handle sync writes differently, and sync_blockdev() needs to > >>>>call blkdev_issue_flush() as well.... > >>>> > >>>>So, what do we do here? Do we define a barrier I/O to only provide > >>>>ordering, or do we define it to also provide persistent storage > >>>>writeback? Whatever we decide, it needs to be documented.... > >>>> > >>>The block layer already has a notion of the two types of barriers, with > >>>a very small amount of tweaking we could expose that. There's absolutely > >>>zero reason we can't easily support both types of barriers. > >>> > >>That sounds like a good idea - we can leave the existing > >>WRITE_BARRIER behaviour unchanged and introduce a new WRITE_ORDERED > >>behaviour that only guarantees ordering. The filesystem can then > >>choose which to use where appropriate.... > >> > > > >Precisely. The current definition of barriers are what Chris and I came > >up with many years ago, when solving the problem for reiserfs > >originally. It is by no means the only feasible approach. > > > >I'll add a WRITE_ORDERED command to the #barrier branch, it already > >contains the empty-bio barrier support I posted yesterday (well a > >slightly modified and cleaned up version). > > > > > Wait. Do filesystems expect (depend on) anything but ordering now? Does > md? Having users of barriers as they currently behave suddenly getting > SYNC behavior where they expect ORDERED is likely to have a negative > effect on performance. Or do I misread what is actually guaranteed by > WRITE_BARRIER now, and a flush is currently happening in all cases? See the above stuff you quote, it's answered there. It's not a change, this is how the Linux barrier write has always worked since I first implemented it. What David and I are talking about is adding a more relaxed version as well, that just implies ordering. > And will this also be available to user space f/s, since I just proposed > a project which uses one? :-( I see several uses for that, so I'd hope so. > I think the goal is good, more choice is almost always better choice, I > just want to be sure there won't be big disk performance regressions. We can't get more heavy weight than the current barrier, it's about as conservative as you can get. -- Jens Axboe - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html