Re: bvec_iter.bi_sector -> loff_t? (was: Re: [PATCH] bcachefs: allow direct io fallback to buffer io for unaligned length or offset)

On Thu, Jun 20, 2024 at 09:36:42AM -0400, Kent Overstreet wrote:
> On Thu, Jun 20, 2024 at 09:21:57PM +0800, Hongbo Li wrote:
> > Support fallback to buffered I/O if the operation being performed on
> > unaligned length or offset. This may change the behavior for direct
> > I/O in some cases.
> > 
> > [Before]
> > For length which aligned with 256 bytes (not SECTOR aligned) will
> > read failed under direct I/O.
> > 
> > [After]
> > For length which aligned with 256 bytes (not SECTOR aligned) will
> > read the data successfully under direct I/O because it will fallback
> > to buffer I/O.

This is against the O_DIRECT requirements.

   O_DIRECT
       The O_DIRECT flag may impose alignment restrictions on  the  length  and
       address  of  user-space  buffers  and the file offset of I/Os.  In Linux
       alignment restrictions vary by filesystem and kernel version  and  might
       be  absent  entirely.   The  handling  of  misaligned O_DIRECT I/Os also
       varies; they can either fail with EINVAL or fall back to buffered I/O.

       Since Linux 6.1, O_DIRECT support and alignment restrictions for a  file
       can  be  queried using statx(2), using the STATX_DIOALIGN flag.  Support
       for STATX_DIOALIGN varies by filesystem; see statx(2).

       Some filesystems provide their  own  interfaces  for  querying  O_DIRECT
       alignment restrictions, for example the XFS_IOC_DIOINFO operation in
       xfsctl(3).  STATX_DIOALIGN should be used instead when it is available.

       If none of the above is available, then direct I/O support and alignment
       restrictions  can  only  be  assumed  from  known characteristics of the
       filesystem, the individual file, the underlying storage  device(s),  and
       the  kernel  version.  In Linux 2.4, most filesystems based on block
       devices require that the file offset and the length and memory address  of
       all  I/O  segments  be multiples of the filesystem block size (typically
       4096 bytes).  In Linux 2.6.0, this was relaxed to the logical block size
       of the block device (typically 512 bytes).   A  block  device's  logical
       block  size  can be determined using the ioctl(2) BLKSSZGET operation or
       from the shell using the command:

           blockdev --getss
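
As a rough illustration of the rule quoted above, a userspace caller could
check a request against the logical block size before issuing it with
O_DIRECT. This is only a sketch: dio_is_aligned() is a hypothetical helper,
not a kernel or libc interface, and it assumes the block size (for example
the value reported by BLKSSZGET) is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing API): returns true when the file
 * offset, the length and the user buffer address are all multiples of the
 * logical block size 'lbs'.
 */
static bool dio_is_aligned(uint64_t offset, size_t len, const void *buf,
                           uint32_t lbs)
{
    uint64_t mask;

    /* Assumption: a valid logical block size is a non-zero power of two. */
    if (lbs == 0 || (lbs & (lbs - 1)) != 0)
        return false;

    mask = (uint64_t)lbs - 1;

    /* Any low bit set in offset, length or address means misalignment. */
    return ((offset | (uint64_t)len | (uint64_t)(uintptr_t)buf) & mask) == 0;
}
```

With a 512-byte logical block size, the 256-byte length from the patch
description fails this check, which is the case the patch would route to
buffered I/O.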

> The catch is that struct bio - bvec_iter - represents addresses with a
> sector_t, and we'd want that to be a loff_t.
> 
> That's something we should do anyways; everything else in struct bio can
> represent a byte-aligned io, bvec_iter.bi_sector is the only exception
> and fixing that would help in consolidating our various scatter-gather
> list data structures - but we'd need buy-in from Jens and Christoph
> before doing that.

I'm against it.  Block devices only do sector-aligned IO and we should
not pretend otherwise.
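
For readers following the thread, the disagreement can be illustrated with
the arithmetic involved. The sketch below is standalone userspace C, not
kernel code: the sketch_* names are made up for this example, and the only
fact taken from the kernel is that a sector is 512 bytes (a shift of 9).
It shows that a sector-granular address cannot carry a sub-sector byte
offset.

```c
#include <stdint.h>

/* Assumption for illustration: 512-byte sectors, as in the block layer. */
#define SKETCH_SECTOR_SHIFT 9

/* Stand-in for a sector-granular address such as bvec_iter.bi_sector. */
typedef uint64_t sketch_sector_t;

/* Convert a byte position to a sector number, dropping the low 9 bits. */
static sketch_sector_t sketch_byte_to_sector(uint64_t pos)
{
    return pos >> SKETCH_SECTOR_SHIFT;
}

/* Convert a sector number back to the byte position of its start. */
static uint64_t sketch_sector_to_byte(sketch_sector_t sector)
{
    return sector << SKETCH_SECTOR_SHIFT;
}

/* The part of a byte position that a sector number cannot represent. */
static uint32_t sketch_sub_sector_offset(uint64_t pos)
{
    return (uint32_t)(pos & ((1u << SKETCH_SECTOR_SHIFT) - 1));
}
```

A byte position of 4352 (4096 + 256) maps to sector 8, and converting back
gives 4096; the remaining 256 bytes would have to be tracked elsewhere.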




