Re: [PATCH] fs: Add a new flag RWF_IOWAIT for preadv2(2)

On Wed, Aug 7, 2024 at 5:52 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Tue, Aug 06, 2024 at 10:05:50PM +0800, Yafang Shao wrote:
> > On Tue, Aug 6, 2024 at 9:24 PM Jan Kara <jack@xxxxxxx> wrote:
> > > On Tue 06-08-24 19:54:58, Yafang Shao wrote:
> > > > Its guarantee is clear:
> > > >
> > > >   : I/O is intended to be atomic to ordinary files and pipes and FIFOs.
> > > >   : Atomic means that all the bytes from a single operation that started
> > > >   : out together end up together, without interleaving from other I/O
> > > >   : operations.
> > >
> > > Oh, I understand why XFS does locking this way and I'm well aware this is
> > > a requirement in POSIX. However, as you have experienced, it has a
> > > significant performance cost for certain workloads (at least with the
> > > simple locking protocol we have now) and history shows users would
> > > rather have the extra performance at the cost of being a bit more
> > > careful in userspace. So I
> > > don't see any filesystem switching to XFS behavior until we have a
> > > performant range locking primitive.
> > >
> > > > What this flag does is avoid waiting for this type of lock if it
> > > > exists. Maybe we should consider a more descriptive name like
> > > > RWF_NOATOMICWAIT, RWF_NOFSLOCK, or RWF_NOPOSIXWAIT? Naming is always
> > > > challenging.
> > >
> > > Aha, OK. So you want the flag to mean "I don't care about POSIX read-write
> > > exclusion". I'm still not convinced the flag is a great idea but
> > > RWF_NOWRITEEXCLUSION could perhaps better describe the intent of the flag.
> >
> > That's better. Should we proceed with implementing this new flag? It
> > provides users with an option to avoid this type of issue.
>
> No. If we are going to add a flag like that, the fix to XFS isn't to
> use IOCB_NOWAIT on reads, it's to use shared locking for buffered
> writes just like we do for direct IO.
>
> IOWs, this flag would be needed on -writes-, not reads, and at that
> point we may as well just change XFS to do shared buffered writes
> for -everyone- so it is consistent with all other Linux filesystems.
>
> Indeed, last time Amir brought this up, I suggested that shared
> buffered write locking in XFS was the simplest way forward. Given
> that we use large folios now, small IOs get mapped to a single folio
> and so will still have the same write vs overlapping write exclusion
> behaviour most of the time.
>
> However, since then we've moved to using shared IO locking for
> cloning files. A clone does not modify data, so read IO is allowed
> during the clone. If we move writes to use shared locking, this
> breaks file cloning. We would have to move cloning back to using
> exclusive locking, and that's going to cause performance and IO
> latency regressions for applications using clones with concurrent IO
> (e.g. VM image snapshots in cloud infrastructure).
>
> Hence the only viable solution to all these different competing "we
> need exclusive access to a range of the file whilst allowing other
> concurrent IO" issues is to move to range locking for IO
> exclusion....

The initial post you mentioned about range locking dates back to 2019.
Five years have passed since then, and nothing has happened.

In 2029, five years later, someone else might encounter this issue
again, and the response will be the same: "let's try range locking."

And then another five years will pass...

So, "range locking == Do nothing." I'm not saying it's your
responsibility to implement range locking, but it seems no one else is
capable of implementing this complex feature except you.

RWF_NOWAIT was initially introduced for AIO in commit b745fafaf70c
("fs: Introduce RWF_NOWAIT and FMODE_AIO_NOWAIT") with a clear
definition that it shouldn't "block while allocating requests while
performing direct I/O."

It was then extended to buffered IO in commit 91f9943e1c7b ("fs:
support RWF_NOWAIT for buffered reads"), where IOCB_NOIO was not set,
meaning it would still perform read IO if the data was not in the page
cache. Readahead support was added for this flag in commit 2e85abf053b9
("mm: allow read-ahead with IOCB_NOWAIT set").

However, this behavior changed in commit efa8480a8316 ("fs: RWF_NOWAIT
should imply IOCB_NOIO") without a clear use case; the commit simply
refers to "RWF_NOWAIT semantics of only doing cached reads." If
non-cached reads break the "RWF_NOWAIT semantics," why not introduce a
new flag for the new semantics, where non-cached reads are allowed?
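
To make the current userspace-visible behavior concrete, here is a
minimal sketch (not part of the patch) of how an application might use
RWF_NOWAIT today: try the cached, non-blocking read first and fall back
to an ordinary blocking pread(2) when the kernel returns EAGAIN. The
helper name read_nowait_then_block() and its "blocked" out-parameter are
hypothetical, and treating EOPNOTSUPP/ENOSYS as "fall back too" is an
assumption made for portability, not something this thread specifies.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel or libc API).
 *
 * First attempt a read that must not block (RWF_NOWAIT). If the kernel
 * reports that it would have to wait (EAGAIN), or that the flag is not
 * supported here (EOPNOTSUPP/ENOSYS, an assumption for portability),
 * retry with a plain blocking pread().
 *
 * *blocked (optional) is set to true when the blocking path was taken.
 * Like any read, the return value may be shorter than len.
 */
static ssize_t read_nowait_then_block(int fd, void *buf, size_t len,
                                      off_t off, bool *blocked)
{
    assert(buf != NULL);

    if (blocked)
        *blocked = false;

#ifdef RWF_NOWAIT
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);

    if (ret >= 0)
        return ret;             /* served without waiting */
    if (errno != EAGAIN && errno != EOPNOTSUPP && errno != ENOSYS)
        return -1;              /* real error, errno is set */
#endif

    /* Slow path: the caller accepts waiting for locks and disk IO. */
    if (blocked)
        *blocked = true;
    return pread(fd, buf, len, off);
}
```

A caller that cares about latency could run the slow path in a worker
thread instead of inline; the sketch only shows where the EAGAIN
decision point sits.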

-- 
Regards
Yafang




