On Fri, Aug 09, 2024 at 09:03:24AM +1000, Dave Chinner wrote:
> The test and set here is racy. A long time can pass between the test
> and the setting of the flag,

The race window is much tighter due to the iolock, but if we really
care about the race here, the right fix is to repeat the check for the
XFS_EOFBLOCKS_RELEASED flag inside the iolock (rough sketch at the end
of this mail).

> so maybe this should be optimised to
> something like:
>
>	if (inode->i_nlink &&
>	    (file->f_mode & FMODE_WRITE) &&
>	    (!(ip->i_flags & XFS_EOFBLOCKS_RELEASED)) &&
>	    xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
>		if (xfs_can_free_eofblocks(ip) &&
>		    !xfs_iflags_test_and_set(ip, XFS_EOFBLOCKS_RELEASED))
>			xfs_free_eofblocks(ip);
>		xfs_iunlock(ip, XFS_IOLOCK_EXCL);
>	}

All these direct i_flags accesses are actually racy too, at least in
theory. We'd probably be better off moving them over to the atomic
bitops and only using the i_flags_lock for any coordination beyond the
actual flags (also sketched below). I'd rather not get into that here
for now, even if it is a worthwhile project for later.

> I do wonder, though - why do we need to hold the IOLOCK to call
> xfs_can_free_eofblocks()? The only thing that really needs
> serialisation is the xfs_bmapi_read() call, and that's done under
> the ILOCK not the IOLOCK. Sure, xfs_free_eofblocks() needs the
> IOLOCK because it's effectively a truncate w.r.t. extending writes,
> but races with extending writes while checking if we need to do that
> operation aren't really a big deal. Worst case is we take the
> lock and free the EOF blocks beyond the writes we raced with.
>
> What am I missing here?

I think the main part of the story is that xfs_can_free_eofblocks was
split out of xfs_free_eofblocks, which requires the iolock. But I'm
not sure if some of the checks are a little racy without the iolock,
although I doubt it matters in practice as they are all optimizations.
I'd need to take a deeper look at this, so maybe it's worth a
follow-on together with the changes to the i_flags handling.
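
For the recheck mentioned at the top, something like the untested
sketch below is what I have in mind. It is basically your version,
just using the existing xfs_iflags_test helper for the unlocked
fast-path check instead of poking at i_flags directly; names are the
ones from the patch under discussion, so take the structure, not the
details:

	if (inode->i_nlink &&
	    (file->f_mode & FMODE_WRITE) &&
	    !xfs_iflags_test(ip, XFS_EOFBLOCKS_RELEASED) &&
	    xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
		/*
		 * Recheck now that we hold the iolock: the
		 * test_and_set closes the window between the
		 * unlocked check above and taking the lock.
		 */
		if (xfs_can_free_eofblocks(ip) &&
		    !xfs_iflags_test_and_set(ip, XFS_EOFBLOCKS_RELEASED))
			xfs_free_eofblocks(ip);
		xfs_iunlock(ip, XFS_IOLOCK_EXCL);
	}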
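
And for the i_flags side project, the direction would be to implement
the xfs_iflags_* helpers on top of the generic atomic bitops instead
of taking i_flags_lock. Purely illustrative and untested - the
invasive part is that the XFS_I* flag definitions would need to become
bit numbers instead of the (1 << n) masks they are today, so this is
shape only:

	/* assumes the XFS_I* flags are redefined as bit numbers */
	static inline void xfs_iflags_set(struct xfs_inode *ip, int bit)
	{
		set_bit(bit, &ip->i_flags);
	}

	static inline bool xfs_iflags_test(struct xfs_inode *ip, int bit)
	{
		return test_bit(bit, &ip->i_flags);
	}

	static inline bool
	xfs_iflags_test_and_set(struct xfs_inode *ip, int bit)
	{
		return test_and_set_bit(bit, &ip->i_flags);
	}

i_flags_lock would then only be left for the places that need to
update flags together with other inode state.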