On Thu, Jan 31, 2013 at 11:41:09PM +0100, Jan Kara wrote:
> On Thu 31-01-13 10:16:00, Dave Chinner wrote:
> > On Tue, Jan 22, 2013 at 03:03:37PM +0100, Jan Kara wrote:
> > >   Hello,
> > >
> > > On Fri 18-01-13 17:10:07, Josef Bacik wrote:
> > > > I'd like to talk about what to do about O_DIRECT.  Nobody really owns it
> > > > and nobody really _wants_ to own it, and we've all been tacking on our
> > > > own file systems optimizations and workarounds to make the generic stuff
> > > > work.  I'm to the point now where I'm just going to do all the work
> > > > ourselves inside of btrfs since we need to have different waiting rules.
> > > > So the question is do we want to just rm -f fs/direct-io.c and let
> > > > everybody do their own thing,
> > >   I don't think we really can. Just grep for its uses. There are like 15
> > > filesystems using it. That would be a huge amount of duplication.
> > >
> > > > or is there some way we can tease out the
> > > > actual generic stuff that everybody is going to need to do and adapt
> > > > everybody to use that?  And then there's the question of what are the
> > > > things we want to do in the generic code, do we want to just do the get
> > > > pages thing, do we want to still have stuff to build and submit the bios?
> > > > What about how AIO interacts with it?
> > >   I'm not sure what issues you are exactly facing but I can understand
> > > blockdev_direct_IO() isn't doing what btrfs would need. And I also agree
> > > with others that the code is rather complex and hard to maintain. E.g. the
> > > get_block_t insanity of using buffer_head is nagging me for a long time.
> > > The handling of unaligned DIO which all filesystems just serialize (at least
> > > for writes) because it causes data corruption. But these are mostly smaller
> > > gradual improvements.
> >
> > The problem I find with small gradual improvements is that every
> > time I try to do one I end up with some weird subtle problem that
> > I've been unable to debug.
> > It's happened several times in the past
> > year, and each time I've given up on trying to make gradual
> > improvements because of this....
> >
> > That fits my definition of unmaintainable code almost perfectly.
> >
> > >   IMHO the devil is in "show me the code that is flexible enough to work
> > > for most, fast, and simpler than what we have". So I think we can speak
> > > about what btrfs (or xfs or whoever else) would need and how we could
> > > change (or whether it's worth to change) the generic code to accommodate
> > > its needs. Hum?
> >
> > I'd say XFS needs very little outside help - AFAICT it still has >90%
> > of the infrastructure it needs to do direct IO itself....
>   I'm not sure it's really about infrastructure. When I look into
> fs/direct-io.c, it is theoretically a trivial thing - just a loop with get
> pages, map blocks, submit bio, repeat. But then there are the details of
> blocksize < pagesize or even DIO not aligned to blocksize, holes in files,
> throttling so that we don't have too many bios in flight...

Sure, that's relatively simple, but once you optimise it repeatedly to
the point where the order of single instructions is important, even
simple code becomes a tangled, ugly mess.

> and suddenly
> we have the beast it is now. And every filesystem will have to deal with
> these special cases so doing it in the generic code looks like a good
> thing to me (I can imagine those nasty subtle bugs when each filesystem has
> to handle all the cases on its own - and I believe XFS may get it right
> pretty quickly but world isn't just XFS)...

The advantage of using shared code is that it eases the burden of
maintenance and enhancement on individual filesystems. Both Josef
and I are putting forward the argument that the shared direct IO
code provides neither of those advantages any more due to the current
complexity and fragility that has resulted from the monolithic
"everything for everyone" approach we currently have.
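
To make the "trivial loop" quoted above concrete, here is a toy userspace
model of it. This is a sketch only: it is not kernel code and is not taken
from fs/direct-io.c. Every name in it (Bio, direct_read, extent_map,
BLOCK_SIZE) is hypothetical, and it assumes blocksize == pagesize and fully
aligned requests, i.e. exactly the easy case. Throttling and unaligned
requests are deliberately left out.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumption: block size == page size, the easy case


@dataclass
class Bio:
    """Toy stand-in for a block I/O request (not the kernel's struct bio)."""
    disk_block: int  # first block on the pretend device
    nr_blocks: int   # number of disk-contiguous blocks covered


def direct_read(extent_map, offset, length):
    """Model the loop: walk the range, map each block, build the bios.

    extent_map is a hypothetical {file_block: disk_block} dict; a missing
    key represents a hole in the file.
    Returns (bios, holes) for the requested byte range.
    """
    if offset % BLOCK_SIZE or length % BLOCK_SIZE:
        # Unaligned direct I/O is one of the hard details; not modelled here.
        raise ValueError("unaligned direct I/O is outside this sketch")

    bios, holes = [], []
    first = offset // BLOCK_SIZE
    for file_block in range(first, first + length // BLOCK_SIZE):  # "get pages"
        disk_block = extent_map.get(file_block)                    # "map blocks"
        if disk_block is None:
            holes.append(file_block)  # a real read would zero-fill this block
            continue
        if bios and bios[-1].disk_block + bios[-1].nr_blocks == disk_block:
            bios[-1].nr_blocks += 1   # disk-contiguous: extend the last bio
        else:
            bios.append(Bio(disk_block, 1))  # start a new bio to "submit"
    return bios, holes
```

Calling direct_read({0: 100, 1: 101, 3: 200}, 0, 4 * BLOCK_SIZE) walks four
file blocks, merges the first two into one request and records block 2 as a
hole. Each of the special cases listed in the quoted text (blocksize <
pagesize, unaligned requests, throttling) would add branches to this loop.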

What I'm trying to say is that maybe there's a better way of
providing generic direct IO support. Perhaps we are better served by
having smaller generic helpers similar to the buffered IO path to
allow filesystems to do the simple stuff as optimally as possible
without all the overhead they don't need. One-size-fits-all has
never worked in the filesystems game, yet we seem to be stuck on
that approach here even when it appears to be collapsing under its
own weight.... :/

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx