Re: [PATCH v2 03/13] xfs: fallthru to buffer attach on error and simplify error handling

Brian Foster <bfoster@xxxxxxxxxx> · Thu, 23 Apr 2020 10:28:59 -0400

On Thu, Apr 23, 2020 at 02:18:23PM +1000, Dave Chinner wrote:
> On Wed, Apr 22, 2020 at 01:54:19PM -0400, Brian Foster wrote:
> > The inode flush code has several layers of error handling between
> > the inode and cluster flushing code. If the inode flush fails before
> > acquiring the backing buffer, the inode flush is aborted. If the
> > cluster flush fails, the current inode flush is aborted and the
> > cluster buffer is failed to handle the initial inode and any others
> > that might have been attached before the error.
> > 
> > Since xfs_iflush() is the only caller of xfs_iflush_cluster(), the
> > error handling between the two can be condensed in the top-level
> > function. If we update xfs_iflush_int() to always fall through to
> > the log item update and attach the item completion handler to the
> > buffer, any errors that occur after the first call to
> > xfs_iflush_int() can be handled with a buffer I/O failure.
> > 
> > Lift the error handling from xfs_iflush_cluster() into xfs_iflush()
> > and consolidate with the existing error handling. This also replaces
> > the need to release the buffer because failing the buffer with
> > XBF_ASYNC drops the current reference.
> > 
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> 
> Needs a better subject line, because I had no idea what it meant
> until I got to the last hunks in the patch.  Perhaps: "Simplify
> inode flush error handling" would be a better summary of the
> patch....
> 

Ok, fixed.

> > @@ -3791,6 +3758,7 @@ xfs_iflush_int(
> >  	struct xfs_inode_log_item *iip = ip->i_itemp;
> >  	struct xfs_dinode	*dip;
> >  	struct xfs_mount	*mp = ip->i_mount;
> > +	int			error;
> 
> There needs to be a comment added to this function to explain why we
> always attached the inode to the buffer and update the flush state,
> even on error. This:
> 

Indeed. Updated as follows with a comment before the first corruption
check:

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 6cdb9fbe2d89..6b8266eeae2d 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3766,9 +3766,14 @@ xfs_iflush_int(
 	       ip->i_d.di_nextents > XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK));
 	ASSERT(iip != NULL && iip->ili_fields != 0);
 
-	/* set *dip = inode's place in the buffer */
 	dip = xfs_buf_offset(bp, ip->i_imap.im_boffset);
 
+	/*
+	 * We don't flush the inode if any of the following checks fail, but we
+	 * do still update the log item and attach to the backing buffer as if
+	 * the flush happened. This is a formality to facilitate predictable
+	 * error handling as the caller will shutdown and fail the buffer.
+	 */
 	error = -EFSCORRUPTED;
 	if (XFS_TEST_ERROR(dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC),
 			       mp, XFS_ERRTAG_IFLUSH_1)) {

Brian

> > @@ -3914,10 +3885,10 @@ xfs_iflush_int(
> >  				&iip->ili_item.li_lsn);
> >  
> >  	/*
> > -	 * Attach the function xfs_iflush_done to the inode's
> > -	 * buffer.  This will remove the inode from the AIL
> > -	 * and unlock the inode's flush lock when the inode is
> > -	 * completely written to disk.
> > +	 * Attach the inode item callback to the buffer whether the flush
> > +	 * succeeded or not. If not, the caller will shut down and fail I/O
> > +	 * completion on the buffer to remove the inode from the AIL and release
> > +	 * the flush lock.
> >  	 */
> >  	xfs_buf_attach_iodone(bp, xfs_iflush_done, &iip->ili_item);
> 
> isn't obviously associated with the "flush_out" label, and so the
> structure of the function really isn't explained until you get to
> the end of the function. And that's still easy to miss...
> 
> Other than that, the code looks OK.
> 
> CHeers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
>