On Tue, Oct 29, 2019 at 04:33:37PM -0700, Darrick J. Wong wrote:
> On Wed, Oct 30, 2019 at 09:37:52AM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> >
> > AIO+DIO can extend the file size on IO completion, and it holds
> > no inode locks while the IO is in flight. Therefore, a race
> > condition exists in file size updates if we do something like this:
> >
> > aio-thread                      fallocate-thread
> >
> > lock inode
> > submit IO beyond inode->i_size
> > unlock inode
> > .....
> >                                 lock inode
> >                                 break layouts
> >                                 if (off + len > inode->i_size)
> >                                         new_size = off + len
> >                                 .....
> >                                 inode_dio_wait()
> >                                 <blocks>
> > .....
> > completes
> > inode->i_size updated
> > inode_dio_done()
> > ....
> >                                 <wakes>
> >                                 <does stuff no longer beyond EOF>
> >                                 if (new_size)
> >                                         xfs_vn_setattr(inode, new_size)
> >
> >
> > Yup, that attempt to extend the file size in the fallocate code
> > turns into a truncate - it removes whatever the aio write
> > allocated and put to disk, and reduces the inode size back down to
> > where the fallocate operation ends.
> >
> > Fundamentally, xfs_file_fallocate() is not compatible with racing
> > AIO+DIO completions, so we need to move the inode_dio_wait() call
> > up to where we lock the inode and break the layouts.
> >
> > Secondly, storing the inode size and then using it unchecked without
> > holding the ILOCK is not safe; we can only do such a thing if we've
> > locked out and drained all IO and other modification operations,
> > which we don't do initially in xfs_file_fallocate.
> >
> > It should be noted that some of the fallocate operations are
> > compound operations - they are made up of multiple manipulations
> > that may zero data, and so we may need to flush and invalidate the
> > file multiple times during an operation. However, we only need to
> > lock out IO and other space manipulation operations once, as that
> > lockout is maintained until the entire fallocate operation has been
> > completed.
> >
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Looks reasonable to me; what do you think of my regression test?

Looks reasonable at first glance. Not much different from what I
was using to test this patch. I haven't looked in more detail than
that yet...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
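
The ordering problem in the quoted commit message can be reduced to a toy model. What follows is a minimal sketch in Python, not kernel code and not the patch itself: ToyInode, toy_dio_wait() and the two fallocate_* functions are hypothetical stand-ins, and locking is left out entirely so that only the order of "sample i_size" and "wait for in-flight DIO" differs between the two variants.

```python
from dataclasses import dataclass


@dataclass
class ToyInode:
    """Hypothetical stand-in for an inode; not a kernel structure."""
    i_size: int        # current file size in bytes
    dio_end: int = 0   # end offset of an in-flight AIO+DIO write, 0 if none


def toy_dio_wait(inode: ToyInode) -> None:
    """Model of waiting for in-flight DIO: the write completes and
    may extend i_size, as AIO+DIO completion does in the description."""
    inode.i_size = max(inode.i_size, inode.dio_end)
    inode.dio_end = 0


def fallocate_racy(inode: ToyInode, off: int, length: int) -> int:
    """Racy ordering: sample i_size first, wait for DIO afterwards."""
    new_size = off + length if off + length > inode.i_size else 0
    toy_dio_wait(inode)           # i_size may grow here
    if new_size:
        inode.i_size = new_size   # stale value: can shrink the file
    return inode.i_size


def fallocate_fixed(inode: ToyInode, off: int, length: int) -> int:
    """Ordering after the fix: drain DIO first, then sample i_size."""
    toy_dio_wait(inode)
    new_size = off + length if off + length > inode.i_size else 0
    if new_size:
        inode.i_size = new_size   # only ever extends the file
    return inode.i_size


if __name__ == "__main__":
    # A 4 KiB file with an in-flight write ending at 64 KiB,
    # and a fallocate covering the first 8 KiB.
    print("racy: ", fallocate_racy(ToyInode(4096, 65536), 0, 8192))
    print("fixed:", fallocate_fixed(ToyInode(4096, 65536), 0, 8192))
```

In this model the racy variant leaves the file at 8192 bytes, discarding the range the completed write had extended it to, while the fixed variant leaves it at 65536 bytes. With no write in flight, both variants behave identically.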