在 2023/5/8 15:48, yangerkun 写道:
在 2023/5/8 11:34, Dave Chinner 写道:
Hi yangerkun,
Sorry to take so long to get to this, I've been busy with other
stuff and I needed to do some thinking on it first.
On Thu, Apr 20, 2023 at 11:35:50AM +0800, yangerkun wrote:
From: yangerkun <yangerkun@xxxxxxxxxx>
commit 84d8949e7707 ("xfs: hold buffer across unpin and potential
shutdown processing") describle a use-after-free bug show as follow.
Call xfs_buf_hold before dec b_pin_count can forbid the problem.
+-----------------------------+--------------------------------+
xlog_ioend_work | xfsaild
... | xfs_buf_delwri_submit_buffers
xfs_buf_item_unpin |
dec &bip->bli_refcount |
dec &bp->b_pin_count |
| // check unpin and go on
| __xfs_buf_submit
| xfs_buf_ioend_fail // shutdown
| xfs_buf_ioend
| xfs_buf_relse
| xfs_buf_free(bp)
xfs_buf_lock(bp) // UAF |
However with the patch, we still get a UAF with shutdown:
+-----------------------------+--------------------------------+
xlog_ioend_work | xlog_cil_push_work // now shutdown
... | xlog_cil_committed
xfs_buf_item_unpin | ...
// bli_refcount = 2 |
dec bli_refcount // 1 | xfs_buf_item_unpin
| dec bli_refcount // 0,will free
| xfs_buf_ioend_fail // free bp
dec b_pin_count // UAF |
Ok, so the race condition here is that we have two callers racing to
run xlog_cil_committed(). We have xlog_ioend_work() doing the
shutdown callbacks for checkpoint contexts that have been aborted
after submission, and xlog_cil_push_work aborting a checkpoint
context before it has been submitted.
xlog_cil_push_work will call xlog_cil_committed once we meet some error
like shutdown, and then call xfs_buf_item_unpin with 'remove' equals 1.
xlog_ioend_work can happened same time which trigger xfs_buf_item_unpin
too, and then bli_refcount will down to zero which trigger
xfs_buf_ioend_fail that free the xfs_buf, so the UAF can trigger.
Fix it by call xfs_buf_hold before dec bli_refcount, and release the
hold once we actually do not need it.
Ok, that works.
However, adding an unconditional buffer reference to each unpin call
so that we can safely reference the buffer after we're released the
BLI indicates that the BLI buffer reference is not guaranteeing
buffer existence once the bli reference for the current pin the bli
holds.
Which means that we need a buffer reference per pin count that is
added. We can then hold that buffer reference in the unpin
processing until we don't need it anymore, and we cover all the
known cases (and any unknown cases) without needing special case
code?
Say, something like the (untested) patch I've attached below?
Hi Dave,
This solution looks great to me! I will run the testcase trigger the uaf
with your patch!
Hi Dave,
This patch works well. Thanks again for your fix!
Thanks,
yangerkun.
Thanks,
yangerkun.
Cheers,
Dave.