> On Mar 6, 2017, at 8:23 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: > >> On Mon, Mar 06, 2017 at 02:00:13PM -0500, Josef Bacik wrote: >> so you retry once, and the second time through we will return false if >> we are unmounting, which means that xfs_buf_do_callbacks gets called. >> >> Now you say that xfs_buf_do_callbacks() being called when we errored >> out is dangerous, but I don't understand why. We have this code >> >> if (need_ail) { >> struct xfs_log_item *log_items[need_ail]; >> int i = 0; >> spin_lock(&ailp->xa_lock); >> for (blip = lip; blip; blip = blip->li_bio_list) { >> iip = INODE_ITEM(blip); >> if (iip->ili_logged && >> blip->li_lsn == iip->ili_flush_lsn) { >> log_items[i++] = blip; >> } >> ASSERT(i <= need_ail); >> } >> /* xfs_trans_ail_delete_bulk() drops the AIL lock. */ >> xfs_trans_ail_delete_bulk(ailp, log_items, i, >> SHUTDOWN_CORRUPT_INCORE); >> } > > [ Your cut-n-paste is horribly mangled in something non-ascii. ] > Yeah I keep trying new mail clients and it's just not going well. >> and need_ail is set if we didn't actually end up on disk properly. So >> we delete the thing and mark the fs as fucked. Am I mis-reading what >> is going on here? > > Yes. That won't shut down the filesystem on an inode write error. It > will simply remove the inode from the AIL, the error will do dropped > on the floor, and the filesystem will not be shut down even though > it's now in a corrupt state. xfs_trans_ail_delete_bulk() only shuts > down the filesytem if it doesn't find the item being removed form > the AIL in the AIL. > > As I keep telling people, what needs to happen is this: > > 1. buffer write error occurs > 2. retries exhausted, mark buffer as errored, run callbacks > 3. xfs_iflush_done() (and other iodone callbacks) need to > look at the buffer error state to determine what to do. > 4. if the buffer is errored and recovery is not possible, > the callback needs to shut down the filesytem. > 5. the callback then needs to call flush abort function for > item (e.g. xfs_iflush_abort()) rather than the normal > IO completion processing path. > >> This is what I did to our 4.6 tree (in testing >> still, not actually pushed to production) >> >> if (bp->b_flags & XBF_ASYNC) { >> ASSERT(bp->b_iodone != NULL); >> >> trace_xfs_buf_item_iodone_async(bp, _RET_IP_); >> >> xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the flag >> */ >> >> if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL))) { >> bp->b_flags |= XBF_WRITE | XBF_ASYNC | >> XBF_DONE | XBF_WRITE_FAIL; >> xfs_buf_submit(bp); >> } else { >> + xfs_buf_do_callbacks(bp); >> + bp->b_fspriv = NULL; >> + bp->b_iodone = NULL; >> xfs_buf_relse(bp); >> } >> >> return; >> } > > It's likely you'll end up with silent on disk corruption if you do > that because it's just tossing the error state away. Ok great this is what I needed, thanks Dave, Josef -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html