On Tue, Sep 15, 2020 at 04:48:53PM -0500, Goldwyn Rodrigues wrote:
> On 10:04 07/09, Dave Chinner wrote:
> > On Thu, Sep 03, 2020 at 06:32:36PM +0200, Christoph Hellwig wrote:
> > > We could trivially do something like this to allow the file system
> > > to call iomap_dio_complete without i_rwsem:
> >
> > That just exposes another deadlock vector:
> >
> > P0                          P1
> > inode_lock()                fallocate(FALLOC_FL_ZERO_RANGE)
> > __iomap_dio_rw()            inode_lock()
> >                             <block>
> > <submits IO>
> > <completes IO>
> > inode_unlock()
> >                             <gets inode_lock()>
> >                             inode_dio_wait()
> > iomap_dio_complete()
> >   generic_write_sync()
> >     btrfs_file_fsync()
> >       inode_lock()
> >       <deadlock>
>
> Can inode_dio_end() be called before generic_write_sync(), as it is done
> in fs/direct-io.c:dio_complete()?

Don't think so. inode_dio_wait() is supposed to indicate that all DIO
is complete, and having the "make it stable" parts of an O_DSYNC DIO
still running after inode_dio_wait() returns means that we still
have DIO running....

For some filesystems, ensuring the DIO data is stable may involve
flushing other data (perhaps we did EOF zeroing before the file
extending DIO) and/or metadata to the log, so we need to guarantee
these DIO related operations are complete and stable before we say
the DIO is done.

> Christoph's solution is a clean approach and would prefer to use it as
> the final solution.

/me shrugs

Christoph's solution simply means you can't use inode_dio_wait() in
the filesystem. btrfs would need its own DIO barrier....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx