Re: [PATCH] xfs: fix deadlock when set label online

yangerkun <yangerkun@xxxxxxxxxxxxxxx> · Thu, 29 Jun 2023 19:55:10 +0800

在 2023/6/29 7:10, Dave Chinner 写道:
On Tue, Jun 27, 2023 at 04:42:41PM +0800, yangerkun wrote:

在 2023/6/27 5:45, Dave Chinner 写道:
On Mon, Jun 26, 2023 at 09:15:42PM +0800, yangerkun wrote:
From: yangerkun <yangerkun@xxxxxxxxxx>

Combine use of xfs_trans_hold and xfs_trans_set_sync in xfs_sync_sb_buf
can trigger a deadlock once shutdown happened concurrently. xlog_ioend_work
will first unpin the sb(which stuck with xfs_buf_lock), then wakeup
xfs_sync_sb_buf. However, xfs_sync_sb_buf never get the chance to unlock
sb until been wakeup by xlog_ioend_work.

xfs_sync_sb_buf
    xfs_trans_getsb // lock sb buf
    xfs_trans_bhold // sb buf keep lock until success commit
    xfs_trans_commit
    ...
      xfs_log_force_seq
        xlog_force_lsn
          xlog_wait_on_iclog
            xlog_wait(&iclog->ic_force_wait... // shutdown happened
    xfs_buf_relse // unlock sb buf

xlog_ioend_work
    xlog_force_shutdown
      xlog_state_shutdown_callbacks
        xlog_cil_process_committed
          xlog_cil_committed
          ...
          xfs_buf_item_unpin
            xfs_buf_lock // deadlock
        wake_up_all(&iclog->ic_force_wait)

xfs_ioc_setlabel use xfs_sync_sb_buf to make sure userspace will see the
change for sb immediately. We can simply call xfs_ail_push_all_sync to
do this and sametime fix the deadlock.

Why is this deadlock specific to the superblock buffer?

Hi Dave,

Thanks a lot for your revirew! We find this problem when do some code
reading(which can help us to fix another growfs bug). And then reproduce it
easily when we set label online frequently with IO error inject at the
sametime.

Right, I know how it can be triggered; that's not actually my
concern...

Can't any buffer that is held locked over a synchronous transaction
commit deadlock during a shutdown like this?

After check all place use xfs_buf_bhold, it seems xfs_sync_sb_buf is the
only convict that combine use xfs_trans_hold and xfs_trans_set_sync(I'm not
familiar with xfs yet, so I may have some problems with my code check)...

Yes, I can also see that. But my concern is that this change only
addresses the symptom, but leaves the underlying deadlock unsolved.

Indeed, this isn't xfs_trans_commit() I'm worried about here; it's
the call to xfs_log_force(mp, XFS_LOG_SYNC) or
xfs_log_force_seq(XFS_LOG_SYNC) with a buffer held locked that I'm
worried about.

i.e. We have a buffer in the CIL (from a previous transaction) that
we currently hold locked while we call xfs_log_force(XFS_LOG_SYNC).
If a shutdown occurs while we are waiting for journal IO completion
to occur, then xlog_ioend_work() will attempt to lock the buffer and
deadlock, right?

e.g. I'm thinking of things like busy extent flushing (hold AGF +
AGFL + AG btree blocks locked when we call xfs_log_force()) could
also be vulnerable to the same deadlock...

You mean something like xfs_allocbt_alloc_block(call xfs_log_force to
flush busy extent which keep agf locked sametime)?

We call xfs_log_force(mp, XFS_LOG_SYNC) after lock agf and before
xfs_trans_commit. It seems ok since xfs_buf_item_unpin will not call
xfs_buf_lock because bli_refcount still keep active(once we hold locked
agf, the bli_refcount will inc in _xfs_trans_bjoin, and keep it until
xfs_trans_commit success(clean agf item) or .iop_unpin(dirty agf item,
call from xlog_ioend_work) which can be called after xfs_trans_commit
too)...

If that's true, how do we avoid the shutdown from causing a deadlock
in these situations?

Cheers,

Dave.