On Mon, Oct 19, 2020 at 09:29:17AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> 
> If processing recovered log intent items fails, we need to cancel all
> the unprocessed recovered items immediately so that a subsequent AIL
> push in the bail out path won't get wedged on the pinned intent items
> that didn't get processed.
> 
> This can happen if the log contains (1) an intent that gets and releases
> an inode, (2) an intent that cannot be recovered successfully, and (3)
> some third intent item.  When recovery of (2) fails, we leave (3) pinned
> in memory.  Inode reclamation is called in the error-out path of
> xfs_mountfs before xfs_log_cancel_mount.  Reclamation calls
> xfs_ail_push_all_sync, which gets stuck waiting for (3).
> 
> Therefore, call xlog_recover_cancel_intents if _process_intents fails.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> ---

Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>

>  fs/xfs/xfs_log_recover.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index a8289adc1b29..87886b7f77da 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3446,6 +3446,14 @@ xlog_recover_finish(
>  		int	error;
>  		error = xlog_recover_process_intents(log);
>  		if (error) {
> +			/*
> +			 * Cancel all the unprocessed intent items now so that
> +			 * we don't leave them pinned in the AIL.  This can
> +			 * cause the AIL to livelock on the pinned item if
> +			 * anyone tries to push the AIL (inode reclaim does
> +			 * this) before we get around to xfs_log_mount_cancel.
> +			 */
> +			xlog_recover_cancel_intents(log);
>  			xfs_alert(log->l_mp, "Failed to recover intents");
>  			return error;
>  		}
> 
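
For anyone following along, the three-intent scenario in the commit message
can be sketched as a small userspace toy model. This is only an illustration
under simplifying assumptions, not the kernel code: every toy_* name below is
hypothetical, a "pinned" flag stands in for an intent item that still holds
the AIL, and the push check stands in for xfs_ail_push_all_sync waiting on it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a recovered intent item sitting in the AIL. */
struct toy_intent {
	bool	pinned;		/* still blocks a synchronous AIL push */
	bool	recoverable;	/* false models intent (2) in the commit message */
};

/* Process items in order and stop at the first failure. */
int toy_process_intents(struct toy_intent *items, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!items[i].recoverable)
			return -1;	/* items[i..n-1] are left pinned */
		items[i].pinned = false;
	}
	return 0;
}

/* Release every item, including the ones that were never processed. */
void toy_cancel_intents(struct toy_intent *items, size_t n)
{
	for (size_t i = 0; i < n; i++)
		items[i].pinned = false;
}

/* Models a synchronous push: it can only finish if nothing is pinned. */
bool toy_ail_push_would_finish(const struct toy_intent *items, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (items[i].pinned)
			return false;
	return true;
}

/* Models the patched error path: cancel before returning the error. */
int toy_recover_finish(struct toy_intent *items, size_t n)
{
	int error = toy_process_intents(items, n);

	if (error)
		toy_cancel_intents(items, n);
	return error;
}
```

With items { ok, failing, ok }, calling toy_process_intents() alone leaves the
second and third items pinned, so the modelled push can never finish. Going
through toy_recover_finish() returns the same error but leaves nothing pinned,
which is the behaviour the patch is after.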