[PATCH RFC] xfs: hold buffer across unpin and potential shutdown processing

The special processing used to simulate a buffer I/O failure on fs
shutdown has a difficult-to-reproduce race that can result in a
use-after-free of the associated buffer. Consider a buffer that has
been committed to the on-disk log and thus is AIL resident. The
buffer lands on the writeback delwri queue, but is subsequently
locked, committed and pinned by another transaction before it is
submitted for I/O. At this point, the buffer is stuck on the delwri
queue as it cannot be submitted for I/O until it is unpinned. A log
checkpoint I/O failure occurs sometime later, which aborts the bli.
The unpin handler is called with the aborted log item, drops the bli
reference count and the pin count, and falls into the I/O failure
simulation path.

The potential problem here is that once the pin count falls to zero
in ->iop_unpin(), xfsaild is free to retry delwri submission of the
buffer at any time, before the unpin handler even completes. If
delwri queue submission wins the race to the buffer lock, it
observes the shutdown state and simulates the I/O failure itself.
This releases both the bli and delwri queue holds and frees the
buffer while xfs_buf_item_unpin() sits on xfs_buf_lock() waiting to
run through the same failure sequence. This problem is rare and
requires many iterations of fstest generic/019 (which simulates disk
I/O failures) to reproduce.

To avoid this problem, hold the buffer across the unpin sequence in
xfs_buf_item_unpin(). This is a bit unfortunate in that the new hold
is unconditional while really only necessary for a rare, fatal error
scenario, but it guarantees the buffer still exists in the off
chance that the handler attempts to access it.
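
As a rough illustration of the lifetime issue (not kernel code), the
sequence above can be modelled in user space. Everything named
model_* below is hypothetical, the interleaving is forced rather than
raced, and the starting hold count of two (bli plus delwri queue) is
an assumption taken from the description above:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct xfs_buf: counters only. */
struct model_buf {
	int	hold;	/* models the buffer hold count */
	int	pin;	/* models b_pin_count */
	bool	freed;	/* set instead of freeing, so callers can inspect it */
};

/* Models xfs_buf_hold(): take an extra reference on a live buffer. */
static void model_hold(struct model_buf *bp)
{
	assert(!bp->freed);
	bp->hold++;
}

/* Models xfs_buf_rele(): the final release "frees" the buffer. */
static void model_rele(struct model_buf *bp)
{
	assert(!bp->freed && bp->hold > 0);
	if (--bp->hold == 0)
		bp->freed = true;
}

/*
 * Models delwri submission winning the race after shutdown: the
 * simulated I/O failure drops both the bli and delwri queue holds.
 */
static void model_delwri_fail(struct model_buf *bp)
{
	model_rele(bp);
	model_rele(bp);
}

/*
 * Models the unpin handler with the racing submitter forced to run
 * in the window after the pin count drops. Returns true if the
 * buffer is still alive at the point where the handler would go on
 * to lock it.
 */
static bool model_unpin(struct model_buf *bp, bool take_hold)
{
	bool alive;

	if (take_hold)
		model_hold(bp);		/* the hold added by this patch */

	bp->pin--;			/* pin count reaches zero */
	model_delwri_fail(bp);		/* xfsaild wins the buffer lock */

	alive = !bp->freed;		/* safe to touch the buffer? */

	if (take_hold)
		model_rele(bp);		/* final release frees it here */
	return alive;
}
```

Starting from { .hold = 2, .pin = 1 }, model_unpin(bp, false) reports
a dead buffer, while model_unpin(bp, true) keeps it alive until the
handler's own release.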

Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
---

This is a patch I've had around for a bit for a very rare corner case I
was able to reproduce in some past testing. I'm sending this as RFC
because I'm curious if folks have any thoughts on the approach. I'd be
OK with this change as is, but I think there are alternatives available
too. We could do something fairly simple like bury the hold in the
remove (abort) case only, or perhaps consider checking IN_AIL state
before the pin count drops and base the hold on that (though that seems
a bit more fragile to me). Thoughts?

Brian

 fs/xfs/xfs_buf_item.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index fb69879e4b2b..a1ad6901eb15 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -504,6 +504,7 @@ xfs_buf_item_unpin(
 
 	freed = atomic_dec_and_test(&bip->bli_refcount);
 
+	xfs_buf_hold(bp);
 	if (atomic_dec_and_test(&bp->b_pin_count))
 		wake_up_all(&bp->b_waiters);
 
@@ -560,6 +561,7 @@ xfs_buf_item_unpin(
 		bp->b_flags |= XBF_ASYNC;
 		xfs_buf_ioend_fail(bp);
 	}
+	xfs_buf_rele(bp);
 }
 
 STATIC uint
-- 
2.26.3



