Re: [PATCH 3/6] xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 24, 2022 at 11:21:00AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> Brian reported a null pointer dereference failure during unmount in
> xfs/006. He tracked the problem down to the AIL being torn down
> before a log shutdown had completed and removed all the items from
> the AIL. The failure occurred in this path while unmount was
> proceeding in another task:
> 
>  xfs_trans_ail_delete+0x102/0x130 [xfs]
>  xfs_buf_item_done+0x22/0x30 [xfs]
>  xfs_buf_ioend+0x73/0x4d0 [xfs]
>  xfs_trans_committed_bulk+0x17e/0x2f0 [xfs]
>  xlog_cil_committed+0x2a9/0x300 [xfs]
>  xlog_cil_process_committed+0x69/0x80 [xfs]
>  xlog_state_shutdown_callbacks+0xce/0xf0 [xfs]
>  xlog_force_shutdown+0xdf/0x150 [xfs]
>  xfs_do_force_shutdown+0x5f/0x150 [xfs]
>  xlog_ioend_work+0x71/0x80 [xfs]
>  process_one_work+0x1c5/0x390
>  worker_thread+0x30/0x350
>  kthread+0xd7/0x100
>  ret_from_fork+0x1f/0x30
> 
> This is processing an EIO error to a log write, and it's
> triggering a force shutdown. This causes the log to be shut down,
> and then it is running attached iclog callbacks from the shutdown
> context. That means the fs and log has already been marked as
> xfs_is_shutdown/xlog_is_shutdown and so high level code will abort
> (e.g. xfs_trans_commit(), xfs_log_force(), etc) with an error
> because of shutdown.
> 
> The umount would have been blocked waiting for a log force
> completion inside xfs_log_cover() -> xfs_sync_sb(). The first thing
> for this situation to occur is for xfs_sync_sb() to exit without
> waiting for the iclog buffer to be comitted to disk. The
> above trace is the completion routine for the iclog buffer, and
> it is shutting down the filesystem.
> 
> xlog_state_shutdown_callbacks() does this:
> 
> {
>         struct xlog_in_core     *iclog;
>         LIST_HEAD(cb_list);
> 
>         spin_lock(&log->l_icloglock);
>         iclog = log->l_iclog;
>         do {
>                 if (atomic_read(&iclog->ic_refcnt)) {
>                         /* Reference holder will re-run iclog callbacks. */
>                         continue;
>                 }
>                 list_splice_init(&iclog->ic_callbacks, &cb_list);
> >>>>>>           wake_up_all(&iclog->ic_write_wait);
> >>>>>>           wake_up_all(&iclog->ic_force_wait);
>         } while ((iclog = iclog->ic_next) != log->l_iclog);
> 
>         wake_up_all(&log->l_flush_wait);
>         spin_unlock(&log->l_icloglock);
> 
> >>>>>>  xlog_cil_process_committed(&cb_list);
> }
> 
> It wakes forces waiters before shutdown processes all the pending
> callbacks.

I'm not sure what this means.

Are you saying that log shutdown wakes up iclog waiters before it
processes pending callbacks?  And then anyone who waits on an iclog (log
forces, I guess?) will wake up and race with the callbacks?

> That means the xfs_sync_sb() waiting on a sync
> transaction in xfs_log_force() on iclog->ic_force_wait will get
> woken before the callbacks attached to that iclog are run. This
> results in xfs_sync_sb() returning an error, and so unmount unblocks
> and continues to run whilst the log shutdown is still in progress.
> 
> Normally this is just fine because the force waiter has nothing to
> do with AIL operations. But in the case of this unmount path, the
> log force waiter goes on to tear down the AIL because the log is now
> shut down and so nothing ever blocks it again from the wait point in
> xfs_log_cover().
> 
> Hence it's a race to see who gets to the AIL first - the unmount
> code or xlog_cil_process_committed() killing the superblock buffer.
> 
> To fix this, we just have to change the order of processing in
> xlog_state_shutdown_callbacks() to run the callbacks before it wakes
> any task waiting on completion of the iclog.

Hmm.  I think my guess above is correct, then.  I /think/ the code looks
ok based on my above guess at comprehension.

--D

> 
> Reported-by: Brian Foster <bfoster@xxxxxxxxxx>
> Fixes: aad7272a9208 ("xfs: separate out log shutdown callback processing")
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_log.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index fbf1b08b698c..388d53df7620 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -487,7 +487,10 @@ xfs_log_reserve(
>   * Run all the pending iclog callbacks and wake log force waiters and iclog
>   * space waiters so they can process the newly set shutdown state. We really
>   * don't care what order we process callbacks here because the log is shut down
> - * and so state cannot change on disk anymore.
> + * and so state cannot change on disk anymore. However, we cannot wake waiters
> + * until the callbacks have been processed because we may be in unmount and
> + * we must ensure that all AIL operations the callbacks perform have completed
> + * before we tear down the AIL.
>   *
>   * We avoid processing actively referenced iclogs so that we don't run callbacks
>   * while the iclog owner might still be preparing the iclog for IO submssion.
> @@ -501,7 +504,6 @@ xlog_state_shutdown_callbacks(
>  	struct xlog_in_core	*iclog;
>  	LIST_HEAD(cb_list);
>  
> -	spin_lock(&log->l_icloglock);
>  	iclog = log->l_iclog;
>  	do {
>  		if (atomic_read(&iclog->ic_refcnt)) {
> @@ -509,14 +511,16 @@ xlog_state_shutdown_callbacks(
>  			continue;
>  		}
>  		list_splice_init(&iclog->ic_callbacks, &cb_list);
> +		spin_unlock(&log->l_icloglock);
> +
> +		xlog_cil_process_committed(&cb_list);
> +
> +		spin_lock(&log->l_icloglock);
>  		wake_up_all(&iclog->ic_write_wait);
>  		wake_up_all(&iclog->ic_force_wait);
>  	} while ((iclog = iclog->ic_next) != log->l_iclog);
>  
>  	wake_up_all(&log->l_flush_wait);
> -	spin_unlock(&log->l_icloglock);
> -
> -	xlog_cil_process_committed(&cb_list);
>  }
>  
>  /*
> @@ -571,11 +575,8 @@ xlog_state_release_iclog(
>  		 * pending iclog callbacks that were waiting on the release of
>  		 * this iclog.
>  		 */
> -		if (last_ref) {
> -			spin_unlock(&log->l_icloglock);
> +		if (last_ref)
>  			xlog_state_shutdown_callbacks(log);
> -			spin_lock(&log->l_icloglock);
> -		}
>  		return -EIO;
>  	}
>  
> @@ -3889,7 +3890,10 @@ xlog_force_shutdown(
>  	wake_up_all(&log->l_cilp->xc_start_wait);
>  	wake_up_all(&log->l_cilp->xc_commit_wait);
>  	spin_unlock(&log->l_cilp->xc_push_lock);
> +
> +	spin_lock(&log->l_icloglock);
>  	xlog_state_shutdown_callbacks(log);
> +	spin_unlock(&log->l_icloglock);
>  
>  	return log_error;
>  }
> -- 
> 2.35.1
> 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux