Re: bvec_iter.bi_sector -> loff_t? (was: Re: [PATCH] bcachefs: allow direct io fallback to buffer io for) unaligned length or offset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 20, 2024 at 03:49:54PM +0100, Matthew Wilcox wrote:
> On Thu, Jun 20, 2024 at 10:16:02AM -0400, Kent Overstreet wrote:
> > That's really just descriptive, not prescriptive.
> > 
> > The intent of O_DIRECT is "bypass the page cache", the alignment
> > restrictions are just a side effect of that. Applications just care
> > about is having predictable performance characteristics.
> 
> But any application that has been written to use O_DIRECT already has the
> alignment & size guarantees in place.  What this patch is attempting to do
> is make it "more friendly" to use, and I'm not sure that's a great idea.
> Not without buy-in from a large cross-section of filesystem people.

The thing is, we're currently conflating "don't use the page cache" and
"don't bounce", and we really shouldn't be.

Sometimes performance is your main criteria and you do want the error if
you misaligned the IO, but not always. The the main reason being that
mixing buffered and O_DIRECT is _really_ expensive, and you will quite
often have codepaths that don't care about performance, but do need to
use O_DIRECT so they don't interfere with your paths that do care about
performance.

So they're forced to deal with bouncing and alignment restrictions at
the applicatoin level, even though they really don't want to be, and we
(at least bcachefs) already have the code to do that bouncing when
required - there's really no reason for them to be hand rolling that,
and if writes are involved then it gets really nontrivial.

This is one of the reasons I'll likely be implementing sub-blocksize
writes through the low level write path in bcachefs. O_DIRECT gives you
things that buffered IO doesn't - it's much cheaper, and simpler to
guarantee write ordering of O_DIRECT writes (c.f. untorn writes,
O_DIRECT is much easier than buffered there as well, and it's even
easier in a COW filesystem) - and it's going to be worthwhile to be able
to provide that behaviour without the alignment restrictions that
currently come with O_DIRECT.

To sum all that up: "don't bounce" should really another rwf flag we can
pass to preadv2/pwritev2 - RWF_NOBOUNCE.

And for bcachefs I would implement bouncing-when-required even without
that flag being available, simply because that's current behaviour (due
to checksums being extent granularity). That flag really only makes
sense for bcachefs once we get opt-in block granular checksums (which is
coming, as well).




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux