On Mon, Mar 28, 2022 at 04:05:31PM -0700, Darrick J. Wong wrote: > On Thu, Mar 24, 2022 at 11:21:00AM +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > Brian reported a null pointer dereference failure during unmount in > > xfs/006. He tracked the problem down to the AIL being torn down > > before a log shutdown had completed and removed all the items from > > the AIL. The failure occurred in this path while unmount was > > proceeding in another task: > > > > xfs_trans_ail_delete+0x102/0x130 [xfs] > > xfs_buf_item_done+0x22/0x30 [xfs] > > xfs_buf_ioend+0x73/0x4d0 [xfs] > > xfs_trans_committed_bulk+0x17e/0x2f0 [xfs] > > xlog_cil_committed+0x2a9/0x300 [xfs] > > xlog_cil_process_committed+0x69/0x80 [xfs] > > xlog_state_shutdown_callbacks+0xce/0xf0 [xfs] > > xlog_force_shutdown+0xdf/0x150 [xfs] > > xfs_do_force_shutdown+0x5f/0x150 [xfs] > > xlog_ioend_work+0x71/0x80 [xfs] > > process_one_work+0x1c5/0x390 > > worker_thread+0x30/0x350 > > kthread+0xd7/0x100 > > ret_from_fork+0x1f/0x30 > > > > This is processing an EIO error to a log write, and it's > > triggering a force shutdown. This causes the log to be shut down, > > and then it is running attached iclog callbacks from the shutdown > > context. That means the fs and log has already been marked as > > xfs_is_shutdown/xlog_is_shutdown and so high level code will abort > > (e.g. xfs_trans_commit(), xfs_log_force(), etc) with an error > > because of shutdown. > > > > The umount would have been blocked waiting for a log force > > completion inside xfs_log_cover() -> xfs_sync_sb(). The first thing > > for this situation to occur is for xfs_sync_sb() to exit without > > waiting for the iclog buffer to be comitted to disk. The > > above trace is the completion routine for the iclog buffer, and > > it is shutting down the filesystem. > > > > xlog_state_shutdown_callbacks() does this: > > > > { > > struct xlog_in_core *iclog; > > LIST_HEAD(cb_list); > > > > spin_lock(&log->l_icloglock); > > iclog = log->l_iclog; > > do { > > if (atomic_read(&iclog->ic_refcnt)) { > > /* Reference holder will re-run iclog callbacks. */ > > continue; > > } > > list_splice_init(&iclog->ic_callbacks, &cb_list); > > >>>>>> wake_up_all(&iclog->ic_write_wait); > > >>>>>> wake_up_all(&iclog->ic_force_wait); > > } while ((iclog = iclog->ic_next) != log->l_iclog); > > > > wake_up_all(&log->l_flush_wait); > > spin_unlock(&log->l_icloglock); > > > > >>>>>> xlog_cil_process_committed(&cb_list); > > } > > > > It wakes forces waiters before shutdown processes all the pending > > callbacks. > > I'm not sure what this means. "It wakes force waiters" i.e. any process in xfs_log_force() waiting on iclog->ic_force_wait... > Are you saying that log shutdown wakes up iclog waiters before it > processes pending callbacks? And then anyone who waits on an iclog (log > forces, I guess?) will wake up and race with the callbacks? Yes, exactly that. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx