Re: [RFC PATCH] btrfs: don't call btrfs_sync_file from iomap context

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 17 Sep 2020 16:29:23 +1000

On Thu, Sep 17, 2020 at 07:52:32AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 17, 2020 at 01:09:42PM +1000, Dave Chinner wrote:
> > > > iomap_dio_complete()
> > > >   generic_write_sync()
> > > >     btrfs_file_fsync()
> > > >       inode_lock()
> > > >       <deadlock>
> > > 
> > > Can inode_dio_end() be called before generic_write_sync(), as it is done
> > > in fs/direct-io.c:dio_complete()?
> > 
> > Don't think so.  inode_dio_wait() is supposed to indicate that all
> > DIO is complete, and having the "make it stable" parts of an O_DSYNC
> > DIO still running after inode_dio_wait() returns means that we still
> > have DIO running....
> > 
> > For some filesystems, ensuring the DIO data is stable may involve
> > flushing other data (perhaps we did EOF zeroing before the file
> > extending DIO) and/or metadata to the log, so we need to guarantee
> > these DIO related operations are complete and stable before we say
> > the DIO is done.
> 
> inode_dio_wait really just waits for active I/O that writes to or reads
> from the file.  It does not imply that the I/O is stable, just like
> i_rwsem itself doesn't.

No, but iomap_dio_rw() considers an O_DSYNC write to be incomplete
until it is stable so that it presents consistent behaviour to
anything calling inode_dio_wait().
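
To make the ordering difference concrete, here is a toy userspace
model. It is not kernel code and is only a sketch: every toy_* name is
hypothetical, the "waiter" is simulated by recording what it would see
at the moment the in-flight count drops to zero, and there is no
locking or concurrency in it at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. Every toy_* name is hypothetical. */
struct toy_inode {
	int dio_count;          /* stands in for the in-flight DIO counter */
	bool data_stable;       /* set once the O_DSYNC "make it stable" step ran */
	bool waiter_woken;      /* a toy inode_dio_wait() caller was released */
	bool waiter_saw_stable; /* what that caller observed when released */
};

static void toy_dio_begin(struct toy_inode *inode)
{
	inode->dio_count++;
}

/* Models inode_dio_end(): dropping the last reference releases waiters. */
static void toy_dio_end(struct toy_inode *inode)
{
	assert(inode->dio_count > 0);
	if (--inode->dio_count == 0) {
		inode->waiter_woken = true;
		inode->waiter_saw_stable = inode->data_stable;
	}
}

/* Models the sync step (generic_write_sync() in the trace above). */
static void toy_write_sync(struct toy_inode *inode)
{
	inode->data_stable = true;
}

/* Ordering described for iomap: make the data stable, then signal completion. */
static bool toy_complete_sync_then_end(void)
{
	struct toy_inode inode = { 0 };

	toy_dio_begin(&inode);
	toy_write_sync(&inode);
	toy_dio_end(&inode);
	return inode.waiter_woken && inode.waiter_saw_stable;
}

/* Ordering described for fs/direct-io.c: signal completion, then sync. */
static bool toy_complete_end_then_sync(void)
{
	struct toy_inode inode = { 0 };

	toy_dio_begin(&inode);
	toy_dio_end(&inode);
	toy_write_sync(&inode);
	return inode.waiter_woken && inode.waiter_saw_stable;
}
```

In this model toy_complete_sync_then_end() returns true and
toy_complete_end_then_sync() returns false: with the second ordering
the waiter is released while the "make it stable" step is still
pending. The model says nothing about the inode_lock() deadlock in the
trace, only about what a waiter can observe.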

> Various file systems have historically called
> the syncing outside i_rwsem and inode_dio_wait (in fact that is what the
> fs/direct-io.c code does, so XFS did as well until a few years ago), and
> that isn't a problem at all - we just can't return to userspace (or call
> ki_complete for in-kernel users) before the data is stable on disk.

I really don't care about userspace here - we use inode_dio_wait()
as an IO completion notification for the purposes of synchronising
internal filesystem state before modifying user data via direct
metadata manipulation. Hence I want sane, consistent, predictable IO
completion notification behaviour regardless of the implementation
path it goes through.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx