On Wed, May 12, 2021 at 01:22:49PM +0100, Christoph Hellwig wrote:
> On Tue, May 11, 2021 at 06:52:44PM -0700, Darrick J. Wong wrote:
> > > is unpinned if the associated item has been aborted and will require
> > > a simulated I/O failure. The hold is already required for the
> > > simulated I/O failure, so the ordering simply guarantees the unpin
> > > handler access to the buffer before it is unpinned and thus
> > > processed by the AIL. This particular ordering is required so long
> > > as the AIL does not acquire a reference on the bli, which is the
> > > long term solution to this problem.
> >
> > Are you working on that too, or are we just going to let that lie for
> > the time being? :)
>
> Wouldn't that be as simple as something like the untested patch below?
>

I actually think this is moderately less simple than the RFC I started
with (see the cover letter for a reference) because there's really no
need for a buffer hold per pin. I moved away from the RFC approach to
this to 1. isolate the hold/rele cycle to the scenario where it's
actually necessary (unpin abort) and 2. document the design flaw that
Dave had pointed out that contributes to this problem.

So point #1 means the explicit hold basically fills the gap that the bli
reference count fails to cover to preserve buffer access by (AIL
resident) log item processing code, and no more, whereas the RFC and the
patch below are a bit more convoluted (even though the code might look
simpler) in that they obscure that context.
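
To make the difference a bit more concrete, here is a rough userspace
toy model of the two approaches. This is not kernel code: every name in
it (toy_buf, toy_hold, pin_per_pin, unpin_abort_only, count_hold_ops) is
a hypothetical stand-in, the abort handling is reduced to a comment, and
there is no concurrency, so it does not reproduce the AIL race itself.
It only shows that both variants leave the buffer reference count
balanced, while the per-pin variant takes a hold on every pin and the
abort-only variant takes one only where the unpin handler actually needs
the buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a buffer; NOT struct xfs_buf. */
struct toy_buf {
	int hold;	/* references keeping the buffer alive */
	int pin_count;	/* outstanding pins */
	int hold_ops;	/* number of toy_hold() calls, for comparison */
};

static void toy_hold(struct toy_buf *bp)
{
	bp->hold++;
	bp->hold_ops++;
}

static void toy_rele(struct toy_buf *bp)
{
	assert(bp->hold > 0);
	bp->hold--;
}

/* Variant A (assumed shape): one hold per pin, dropped on every unpin. */
static void pin_per_pin(struct toy_buf *bp)
{
	toy_hold(bp);
	bp->pin_count++;
}

static void unpin_per_pin(struct toy_buf *bp, bool aborted)
{
	(void)aborted;	/* the hold is dropped whether or not we abort */
	bp->pin_count--;
	toy_rele(bp);
}

/* Variant B (assumed shape): hold only around the unpin abort case. */
static void pin_abort_only(struct toy_buf *bp)
{
	bp->pin_count++;
}

static void unpin_abort_only(struct toy_buf *bp, bool aborted)
{
	if (aborted)
		toy_hold(bp);	/* taken before the pin count drops */
	bp->pin_count--;
	/* abort handling would access bp here, covered by the hold */
	if (aborted)
		toy_rele(bp);
}

/*
 * Pin the buffer npins times, then unpin it npins times with only the
 * final unpin aborted. Returns how many holds were taken along the way.
 */
static int count_hold_ops(bool per_pin, int npins)
{
	/* hold = 1 models the reference the log item already owns */
	struct toy_buf bp = { .hold = 1, .pin_count = 0, .hold_ops = 0 };
	int i;

	for (i = 0; i < npins; i++) {
		if (per_pin)
			pin_per_pin(&bp);
		else
			pin_abort_only(&bp);
	}
	for (i = 0; i < npins; i++) {
		bool aborted = (i == npins - 1);

		if (per_pin)
			unpin_per_pin(&bp, aborted);
		else
			unpin_abort_only(&bp, aborted);
	}

	/* Both variants must end balanced. */
	assert(bp.hold == 1 && bp.pin_count == 0);
	return bp.hold_ops;
}
```

Calling count_hold_ops(true, 4) and count_hold_ops(false, 4) from a
small main() gives 4 and 1 respectively in this model. The per-pin
variant is shorter, but nothing in it says which of those holds is the
one that matters.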

Brian

>
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index fb69879e4b2b..07e08713ecd4 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -471,6 +471,7 @@ xfs_buf_item_pin(
>  	trace_xfs_buf_item_pin(bip);
>  
>  	atomic_inc(&bip->bli_refcount);
> +	xfs_buf_hold(bip->bli_buf);
>  	atomic_inc(&bip->bli_buf->b_pin_count);
>  }
>  
> @@ -552,14 +553,15 @@ xfs_buf_item_unpin(
>  		xfs_buf_relse(bp);
>  	} else if (freed && remove) {
>  		/*
> -		 * The buffer must be locked and held by the caller to simulate
> -		 * an async I/O failure.
> +		 * The buffer must be locked to simulate an async I/O failure.
> +		 * xfs_buf_ioend_fail will drop our buffer reference.
>  		 */
>  		xfs_buf_lock(bp);
> -		xfs_buf_hold(bp);
>  		bp->b_flags |= XBF_ASYNC;
>  		xfs_buf_ioend_fail(bp);
> +		return;
>  	}
> +	xfs_buf_rele(bp);
>  }
>  
>  STATIC uint
>