Re: [PATCH 03/12] xfs: always attach iflush_done and simplify error handling

On Fri, Apr 17, 2020 at 11:08:50AM -0400, Brian Foster wrote:
> The inode flush code has several layers of error handling between
> the inode and cluster flushing code. If the inode flush fails before
> acquiring the backing buffer, the inode flush is aborted. If the
> cluster flush fails, the current inode flush is aborted and the
> cluster buffer is failed to handle the initial inode and any others
> that might have been attached before the error.
> 
> Since xfs_iflush() is the only caller of xfs_iflush_cluser(), the

xfs_iflush_cluster()

> error handling between the two can be condensed in the top-level
> function. If we update xfs_iflush_int() to attach the item
> completion handler to the buffer first, any errors that occur after
> the first call to xfs_iflush_int() can be handled with a buffer
> I/O failure.
> 
> Lift the error handling from xfs_iflush_cluster() into xfs_iflush()
> and consolidate with the existing error handling. This also replaces
> the need to release the buffer because failing the buffer with
> XBF_ASYNC drops the current reference.

Yeah, that makes sense. I've lifted the cluster flush error handling
into the callers, even though xfs_iflush() has gone away.

However...

> @@ -3798,6 +3765,13 @@ xfs_iflush_int(
>  	       ip->i_d.di_nextents > XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK));
>  	ASSERT(iip != NULL && iip->ili_fields != 0);
>  
> +	/*
> +	 * Attach the inode item callback to the buffer. Whether the flush
> +	 * succeeds or not, buffer I/O completion processing is now required to
> +	 * remove the inode from the AIL and release the flush lock.
> +	 */
> +	xfs_buf_attach_iodone(bp, xfs_iflush_done, &iip->ili_item);
> +
>  	/* set *dip = inode's place in the buffer */
>  	dip = xfs_buf_offset(bp, ip->i_imap.im_boffset);

...I'm not convinced this is a valid thing to do at this point. The
inode item has not yet been set up with the state that is associated
with the flushing of the inode (e.g. lsn, last_flags, etc.), so this
kinda leaves a landmine in the item IO completion processing: the
failure path cannot rely on any of the inode log item state to make
conditional decisions.

While it's technically not wrong, it just makes me uneasy, as in
future the flush abort code will have to be careful about using
inode state in making decisions, and there are no comments in the
abort code to indicate that the state may be invalid...
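
To make the ordering concern concrete, here is a minimal user-space
sketch. It is not kernel code: every model_* name is a hypothetical
stand-in invented for illustration, and the "valid state" check is an
assumption about what a completion handler might want to test, not what
xfs_iflush_done() actually does. It only shows that a callback attached
before the per-flush state is captured can run against unset state if
the buffer is failed in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode log item (not the real structure). */
struct model_item {
	uint64_t	commit_lsn;	/* LSN of the last logged change */
	uint64_t	flush_lsn;	/* LSN captured when the flush is set up */
	unsigned int	fields;		/* dirty fields not yet flushed */
	unsigned int	last_fields;	/* fields captured for this flush */
};

struct model_buf;
typedef void (*model_iodone_t)(struct model_buf *bp, struct model_item *item);

/* Hypothetical stand-in for a buffer with one attached completion callback. */
struct model_buf {
	model_iodone_t		iodone;
	struct model_item	*item;
	bool			saw_valid_state; /* recorded by the callback */
};

void model_attach_iodone(struct model_buf *bp, model_iodone_t cb,
			 struct model_item *item)
{
	assert(bp != NULL && item != NULL);
	bp->iodone = cb;
	bp->item = item;
}

/* Capture the per-flush state that completion processing may later read. */
void model_setup_flush_state(struct model_item *item)
{
	item->last_fields = item->fields;
	item->fields = 0;
	item->flush_lsn = item->commit_lsn;
}

/* Completion handler: record whether the per-flush state had been captured. */
void model_flush_done(struct model_buf *bp, struct model_item *item)
{
	bp->saw_valid_state = item->flush_lsn == item->commit_lsn &&
			      item->last_fields != 0;
}

/* Fail the buffer: runs the attached callback, if there is one. */
void model_buf_ioerror(struct model_buf *bp)
{
	if (bp->iodone)
		bp->iodone(bp, bp->item);
}

/*
 * Simulate a flush that fails right after the callback is attached.
 * With attach_first the callback is attached before the state is set up;
 * otherwise the state is set up first. Returns true if the completion
 * handler observed captured flush state.
 */
bool model_flush_fails_after_attach(bool attach_first)
{
	struct model_item item = { .commit_lsn = 42, .fields = 0x3 };
	struct model_buf bp = { 0 };

	if (attach_first) {
		model_attach_iodone(&bp, model_flush_done, &item);
		/* error detected before the per-flush state is captured */
		model_buf_ioerror(&bp);
		return bp.saw_valid_state;
	}

	model_setup_flush_state(&item);
	model_attach_iodone(&bp, model_flush_done, &item);
	model_buf_ioerror(&bp);
	return bp.saw_valid_state;
}
```

In this model, model_flush_fails_after_attach(true) has the handler run
with flush_lsn and last_fields still zero, while
model_flush_fails_after_attach(false) has it run with both captured.
That is the sort of thing the abort path would need a comment about.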

/me has chased several subtle issues through this code recently...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


