On Thu, Oct 13, 2022 at 04:19:15PM -0700, Darrick J. Wong wrote: > On Fri, Oct 14, 2022 at 09:25:53AM +1100, Dave Chinner wrote: > > On Sun, Oct 02, 2022 at 11:19:48AM -0700, Darrick J. Wong wrote: > > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > > > Currently, the only way to lock an allocation group is to hold the AGI > > > and AGF buffers. If repair needs to roll the transaction while > > > repairing some AG metadata, it maintains that lock by holding the two > > > buffers across the transaction roll and joins them afterwards. > > > > > > However, repair is not the same as the other parts of XFS that employ > > > this bhold/bjoin sequence, because it's possible that the AGI or AGF > > > buffers are not actually dirty before the roll. In this case, the > > > buffer log item can detach from the buffer, which means that we have to > > > > Doesn't this imply we have a reference counting problem with > > XFS_BLI_HOLD buffers? i.e. the bli can only get detached when the > > reference count on it goes to zero. If the buffer is clean and > > joined to a transaction, then that means the only reference to the > > BLI is the current transaction. Hence the only way it can get > > detached is for the transaction commit to release the current > > transaction's reference to the BLI. > > > > Ah, XFS_BLI_HOLD does not take a reference to the BLI - it just > > prevents ->iop_release from releasing the -buffer- after it drops > > the transaction reference to the BLI. That's the problem right there > > - xfs_buf_item_release() drops the current trans ref to the clean > > item via xfs_buf_item_release() regardless of whether BLI_HOLD is > > set or not, hence freeing the BLI on clean buffers. > > > > IOWs, it looks to me like XFS_BLI_HOLD should actually hold a > > reference to the BLI as well as the buffer so that we don't free the > > BLI for a held clean buffer in xfs_buf_item_release(). The reference > > we leave behind will then be picked up by the subsequent call to > > xfs_trans_bjoin() which finds the clean BLI already attached to the > > buffer... > > <nod> I think you're saying that _xfs_trans_bjoin should: > > if (!(bip->bli_flags & XFS_BLI_HOLD)) > atomic_inc(&bip->bli_refcount); > > and xfs_buf_item_release should do: > > if (hold) > return; > released = xfs_buf_item_put(bip); > if (stale && !released) > return; Not exactly. What I was thinking was something like this: xfs_trans_bhold() should do: bip->bli_flags |= XFS_BLI_HOLD; atomic_inc(&bip->bli_refcount); xfs_trans_bhold_release() should do: bip->bli_flags &= ~XFS_BLI_HOLD; atomic_dec(&bip->bli_refcount); xfs_trans_brelse() shoudl do: if (bip->bli_flags & XFS_BLI_HOLD) { bip->bli_flags &= ~XFS_BLI_HOLD; atomic_dec(&bip->bli_refcount); } and xfs_buf_item_release() should do: if (hold) { /* Release the hold ref but not the rolling transaction ref */ bip->bli_flags &= ~XFS_BLI_HOLD; atomic_dec(&bip->bli_refcount); return; } released = xfs_buf_item_put(bip); if (stale && !released) return; Then _xfs_trans_bjoin() remains unchanged, as we don't remove the BLI from the held clean buffer as there is still a reference to it. The new transaction we rejoin the buffer to will take that reference. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx