Re: [PATCH] xfs: Remove i_rwsem lock in buffered read

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jan 18, 2025 at 02:03:41PM +0100, Amir Goldstein wrote:
> On Fri, Jan 17, 2025 at 11:19 PM Dave Chinner
> <david@xxxxxxxxxxxxx> wrote:
> > On Fri, Jan 17, 2025 at 02:27:46PM +0100, Amir Goldstein wrote:
> > > On Wed, Jan 15, 2025 at 10:41 PM Dave Chinner
> > > <david@xxxxxxxxxxxxx> wrote: For all practical purposes, we
> > > could maintain a counter in inode not for submitted DIO, but
> > > for files opened O_DIRECT.
> > >
> > > The first open O_DIRECT could serialize with in-flight
> > > buffered writes holding a shared iolock and then buffered
> > > writes could take SH vs. EX iolock depending on folio state
> > > and on i_dio_open_count.
> >
> > I don't see how this can be made to work w.r.t. sane data
> > coherency behaviour. e.g. how does this model serialise a new
> > DIO write that is submitted after a share-locked buffered write
> > has just started and passed all the "i_dio_count == 0" checks
> > that enable it to use shared locking? i.e. we now have
> > potentially overlapping buffered writes and DIO writes being
> > done concurrently because the buffered write may not have
> > instantiated folios in the page cache yet....
> >
> 
> A shared-locked buffered write can only start when there is no
> O_DIRECT file open on the inode.  The first open for O_DIRECT to
> increment i_dio_open_count needs to take exclusive iolock to wait
> for all in-flight buffered writes then release the iolock.
> 
> All DIO submitted via O_DIRECT fds will be safe against in-flight
> share-locked buffered write.

Hooking ->open() and ->release() isn't sufficient. O_DIRECT can be
changed at any time via fcntl(F_SETFL) which isn't synchronised
against IO in progress at all. i.e. we can't track that, nor can we
even use f_ops->check_flags to reject changes to O_DIRECT state
because it doesn't pass either the filp or the inode....

IOWs, as it stands logic like "is the file opened for O_DIRECT" in
the IO path is inherently racy and the racing mechanisms are
directly under the control of unprivileged userspace applications.
Without mods from the VFS down and new hooks being implemented in
XFS along with all the whacky "which serialisiation method do we
use" logic and dealing with the complexities and switching
between them, tracking struct files that are open for O_DIRECT isn't
really a viable mechanism for enabling shared buffered writes...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux