Re: [PATCH] xfs: clear XBF_ASYNC if we attach an iodone callback

Brian Foster <bfoster@xxxxxxxxxx> · Tue, 7 Mar 2017 11:12:46 -0500

On Tue, Mar 07, 2017 at 10:56:05AM -0500, Josef Bacik wrote:
> On Tue, 2017-03-07 at 12:22 +1100, Dave Chinner wrote:
> > On Mon, Mar 06, 2017 at 02:00:13PM -0500, Josef Bacik wrote:
> > > 
> > > so you retry once, and the second time through we will return false
> > > if
> > > we are unmounting, which means that xfs_buf_do_callbacks gets
> > > called.
> > > 
> > > Now you say that xfs_buf_do_callbacks() being called when we
> > > errored
> > > out is dangerous, but I don't understand why.  We have this code
> > > 
> > >         if (need_ail) {
> > >                 struct xfs_log_item *log_items[need_ail];
> > >                 int i = 0;
> > >                 spin_lock(&ailp->xa_lock);
> > >                 for (blip = lip; blip; blip = blip->li_bio_list) {
> > >                         iip = INODE_ITEM(blip);
> > >                         if (iip->ili_logged &&
> > >                             blip->li_lsn == iip->ili_flush_lsn) {
> > >                                 log_items[i++] = blip;
> > >                         }
> > >                         ASSERT(i <= need_ail);
> > >                 }
> > >                 /* xfs_trans_ail_delete_bulk() drops the AIL lock.
> > > */
> > >                 xfs_trans_ail_delete_bulk(ailp, log_items, i,
> > >                                           SHUTDOWN_CORRUPT_INCORE);
> > >         }
> > [ Your cut-n-paste is horribly mangled in something non-ascii. ]
> > 
> > > 
> > > and need_ail is set if we didn't actually end up on disk properly.
> > >  So
> > > we delete the thing and mark the fs as fucked.  Am I mis-reading
> > > what
> > > is going on here?
> > Yes. That won't shut down the filesystem on an inode write error. It
> > will simply remove the inode from the AIL, the error will do dropped
> > on the floor, and the filesystem will not be shut down even though
> > it's now in a corrupt state. xfs_trans_ail_delete_bulk() only shuts
> > down the filesytem if it doesn't find the item being removed form
> > the AIL in the AIL.
> > 
> > As I keep telling people, what needs to happen is this:
> > 
> > 	1. buffer write error occurs
> > 	2. retries exhausted, mark buffer as errored, run callbacks
> > 	3. xfs_iflush_done() (and other iodone callbacks) need to
> > 	   look at the buffer error state to determine what to do.
> > 	4. if the buffer is errored and recovery is not possible,
> > 	   the callback needs to shut down the filesytem.
> > 	5. the callback then needs to call flush abort function for
> > 	   item (e.g. xfs_iflush_abort()) rather than the normal
> > 	   IO completion processing path.
> > 
> > > 
> > >  This is what I did to our 4.6 tree (in testing
> > > still, not actually pushed to production)
> > > 
> > >         if (bp->b_flags & XBF_ASYNC) {
> > >                 ASSERT(bp->b_iodone != NULL);
> > > 
> > >                 trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> > > 
> > >                 xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the
> > > flag
> > > */
> > > 
> > >                 if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL))) {
> > >                         bp->b_flags |= XBF_WRITE | XBF_ASYNC
> > >                                        XBF_DONE | XBF_WRITE_FAIL;
> > >                         xfs_buf_submit(bp);
> > >                 } else {
> > > +                       xfs_buf_do_callbacks(bp);
> > > +                       bp->b_fspriv = NULL;
> > > +                       bp->b_iodone = NULL;
> > >                         xfs_buf_relse(bp);
> > >                 }
> > > 
> > >                 return;
> > >         }
> > It's likely you'll end up with silent on disk corruption if you do
> > that because it's just tossing the error state away.
> 
> I think I'm getting a handle on this, but I wonder if we don't have
> this problem currently?  We get an error on a buffer, we retry it and
> it fails again.  If we don't have XFS_ERR_RETRY_FOREVER set then we
> error out and do the normal callbacks as if nothing ever happened, and
> we get the silent corruption without shutting down the file system.
>  How is the code as it stands without XFS_ERR_RETRY_FOREVER safe?
>  Thanks,
> 

We shouldn't get into a situation where we run the callbacks when the
I/O has failed and the fs isn't shut down. IOWs, we do run the callbacks
only when the I/O succeeds, or it fails and the fs is shutdown as a
result (never when the I/O fails but we haven't shut down, which is the
problematic case the patch above introduces).

That tunable error handling logic (XFS_ERR_RETRY_FOREVER) basically
determines when an I/O error is considered permanent, which means
retries are futile and we have no choice but to shutdown the fs. E.g.,
if an error is deemed permanent, we jump to permanent_error:

        /*
         * Permanent error - we need to trigger a shutdown if we haven't already
         * to indicate that inconsistency will result from this action.
         */
permanent_error:
        xfs_force_shutdown(mp, SHUTDOWN_META_IO_ERROR);
out_stale:
        xfs_buf_stale(bp);
        bp->b_flags |= XBF_DONE;
        trace_xfs_buf_error_relse(bp, _RET_IP_);
        return false;
}

... which shuts down the filesystem before we go any farther.

Brian

> Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html