Re: [PATCH] xfs: clear XBF_ASYNC if we attach an iodone callback

Josef Bacik <jbacik@xxxxxx> · Tue, 7 Mar 2017 02:37:40 +0000

> On Mar 6, 2017, at 8:23 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> 
>> On Mon, Mar 06, 2017 at 02:00:13PM -0500, Josef Bacik wrote:
>> so you retry once, and the second time through we will return false if
>> we are unmounting, which means that xfs_buf_do_callbacks gets called.
>> 
>> Now you say that xfs_buf_do_callbacks() being called when we errored
>> out is dangerous, but I don't understand why.  We have this code
>> 
>>         if (need_ail) {
>>                 struct xfs_log_item *log_items[need_ail];
>>                 int i = 0;
>>                 spin_lock(&ailp->xa_lock);
>>                 for (blip = lip; blip; blip = blip->li_bio_list) {
>>                         iip = INODE_ITEM(blip);
>>                         if (iip->ili_logged &&
>>                             blip->li_lsn == iip->ili_flush_lsn) {
>>                                 log_items[i++] = blip;
>>                         }
>>                         ASSERT(i <= need_ail);
>>                 }
>>                 /* xfs_trans_ail_delete_bulk() drops the AIL lock. */
>>                 xfs_trans_ail_delete_bulk(ailp, log_items, i,
>>                                           SHUTDOWN_CORRUPT_INCORE);
>>         }
> 
> [ Your cut-n-paste is horribly mangled in something non-ascii. ]
> 

Yeah I keep trying new mail clients and it's just not going well.

>> and need_ail is set if we didn't actually end up on disk properly.  So
>> we delete the thing and mark the fs as fucked.  Am I mis-reading what
>> is going on here?
> 
> Yes. That won't shut down the filesystem on an inode write error. It
> will simply remove the inode from the AIL, the error will do dropped
> on the floor, and the filesystem will not be shut down even though
> it's now in a corrupt state. xfs_trans_ail_delete_bulk() only shuts
> down the filesytem if it doesn't find the item being removed form
> the AIL in the AIL.
> 
> As I keep telling people, what needs to happen is this:
> 
>    1. buffer write error occurs
>    2. retries exhausted, mark buffer as errored, run callbacks
>    3. xfs_iflush_done() (and other iodone callbacks) need to
>       look at the buffer error state to determine what to do.
>    4. if the buffer is errored and recovery is not possible,
>       the callback needs to shut down the filesytem.
>    5. the callback then needs to call flush abort function for
>       item (e.g. xfs_iflush_abort()) rather than the normal
>       IO completion processing path.
> 
>>  This is what I did to our 4.6 tree (in testing
>> still, not actually pushed to production)
>> 
>>         if (bp->b_flags & XBF_ASYNC) {
>>                 ASSERT(bp->b_iodone != NULL);
>> 
>>                 trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
>> 
>>                 xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the flag
>> */
>> 
>>                 if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL))) {
>>                         bp->b_flags |= XBF_WRITE | XBF_ASYNC |
>>                                        XBF_DONE | XBF_WRITE_FAIL;
>>                         xfs_buf_submit(bp);
>>                 } else {
>> +                       xfs_buf_do_callbacks(bp);
>> +                       bp->b_fspriv = NULL;
>> +                       bp->b_iodone = NULL;
>>                         xfs_buf_relse(bp);
>>                 }
>> 
>>                 return;
>>         }
> 
> It's likely you'll end up with silent on disk corruption if you do
> that because it's just tossing the error state away.

Ok great this is what I needed, thanks Dave,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html