On Wed, Jul 14, 2021 at 10:31:25AM +1000, Dave Chinner wrote: > On Tue, Jul 13, 2021 at 05:00:08PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > The refcount and rmap finish_one functions stash a btree cursor when > > there are multiple ->finish_one calls to be made in a single > > transaction. This mechanism is how we maintain the AGF lock between > > operations of a single intent item. Since ag btree cursors now need > > active references to perag structures, we must preserve the perag > > reference when we save the cursor. > > Hmmm. The cursor already carries it's own internal reference. So > this: > > > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > --- > > fs/xfs/libxfs/xfs_refcount.c | 33 ++++++++++++++++++++------------- > > fs/xfs/libxfs/xfs_rmap.c | 8 +++++++- > > 2 files changed, 27 insertions(+), 14 deletions(-) > > > > diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c > > index 860a0c9801ba..cfd98958d38c 100644 > > --- a/fs/xfs/libxfs/xfs_refcount.c > > +++ b/fs/xfs/libxfs/xfs_refcount.c > > @@ -1113,13 +1113,16 @@ xfs_refcount_finish_one_cleanup( > > int error) > > { > > struct xfs_buf *agbp; > > + struct xfs_perag *pag; > > > > if (rcur == NULL) > > return; > > agbp = rcur->bc_ag.agbp; > > + pag = rcur->bc_ag.pag; > > xfs_btree_del_cursor(rcur, error); > > if (error) > > xfs_trans_brelse(tp, agbp); > > + xfs_perag_put(pag); > > ... is just duplicating the reference the cursor already carries and > drops inside xfs_btree_del_cursor(). > > What problem is this actually fixing? LOL, I forgot that the btree cursor increments pag_ref on its own. Self-NAK for this patch then. --D > > > } > > > > /* > > @@ -1142,19 +1145,20 @@ xfs_refcount_finish_one( > > struct xfs_mount *mp = tp->t_mountp; > > struct xfs_btree_cur *rcur; > > struct xfs_buf *agbp = NULL; > > - int error = 0; > > + struct xfs_perag *pag; > > + unsigned long nr_ops = 0; > > + xfs_agnumber_t agno; > > xfs_agblock_t bno; > > xfs_agblock_t new_agbno; > > - unsigned long nr_ops = 0; > > int shape_changes = 0; > > - struct xfs_perag *pag; > > + int error = 0; > > > > - pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, startblock)); > > + agno = XFS_FSB_TO_AGNO(mp, startblock); > > + pag = xfs_perag_get(mp, agno); > > bno = XFS_FSB_TO_AGBNO(mp, startblock); > > > > - trace_xfs_refcount_deferred(mp, XFS_FSB_TO_AGNO(mp, startblock), > > - type, XFS_FSB_TO_AGBNO(mp, startblock), > > - blockcount); > > + trace_xfs_refcount_deferred(mp, agno, type, > > + XFS_FSB_TO_AGBNO(mp, startblock), blockcount); > > > > if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REFCOUNT_FINISH_ONE)) { > > error = -EIO; > > @@ -1174,14 +1178,16 @@ xfs_refcount_finish_one( > > *pcur = NULL; > > } > > if (rcur == NULL) { > > - error = xfs_alloc_read_agf(tp->t_mountp, tp, pag->pag_agno, > > + error = xfs_alloc_read_agf(mp, tp, agno, > > XFS_ALLOC_FLAG_FREEING, &agbp); > > Please don't revert these back to using a local variable. The next > step in cleaning up all these agf/agi read functions is to pass the > perag into them rather than the mount/agno pair.... > > > if (error) > > goto out_drop; > > > > + /* The cursor now owns the AGF buf and perag ref */ > > rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, pag); > > rcur->bc_ag.refc.nr_ops = nr_ops; > > rcur->bc_ag.refc.shape_changes = shape_changes; > > + pag = NULL; > > The cursor takes it's own reference inside > xfs_refcountbt_init_cursor() that covers the perag for the life of > the cursor. THe local get/put covers the perag for this function, > and guarantees that the init_cursor() function can get it's own > reference without blocking because the perag already has active > references. > > Also, the cursor doesn't actually own the agbp at all. The active > reference to the agbp is actually carried by the transaction, not > the cursor, and if it is dirty when xfs_trans_brelse() is called, > then transaction reference is not dropped until > xfs_trans_commit()... > > Hence I think you're conflating "reference counted object" > with "cursor contains an object pointer" here, and as such the > statements about both objects are incorrect for different reasons... > > > } > > *pcur = rcur; > > > > @@ -1189,12 +1195,12 @@ xfs_refcount_finish_one( > > case XFS_REFCOUNT_INCREASE: > > error = xfs_refcount_adjust(rcur, bno, blockcount, &new_agbno, > > new_len, XFS_REFCOUNT_ADJUST_INCREASE, NULL); > > - *new_fsb = XFS_AGB_TO_FSB(mp, pag->pag_agno, new_agbno); > > + *new_fsb = XFS_AGB_TO_FSB(mp, agno, new_agbno); > > break; > > case XFS_REFCOUNT_DECREASE: > > error = xfs_refcount_adjust(rcur, bno, blockcount, &new_agbno, > > new_len, XFS_REFCOUNT_ADJUST_DECREASE, NULL); > > - *new_fsb = XFS_AGB_TO_FSB(mp, pag->pag_agno, new_agbno); > > + *new_fsb = XFS_AGB_TO_FSB(mp, agno, new_agbno); > > break; > > case XFS_REFCOUNT_ALLOC_COW: > > *new_fsb = startblock + blockcount; > > @@ -1211,10 +1217,11 @@ xfs_refcount_finish_one( > > error = -EFSCORRUPTED; > > } > > if (!error && *new_len > 0) > > - trace_xfs_refcount_finish_one_leftover(mp, pag->pag_agno, type, > > - bno, blockcount, new_agbno, *new_len); > > + trace_xfs_refcount_finish_one_leftover(mp, agno, type, bno, > > + blockcount, new_agbno, *new_len); > > out_drop: > > - xfs_perag_put(pag); > > + if (pag) > > + xfs_perag_put(pag); > > return error; > > Yup, this just smells wrong. The local get/put covers the reference > for the local function references, the reference gained in > _init_cursor ensures the perag is referenced for the life of the > cursor across multiple iterations (including duplicated child > cursors that also take their own references). > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx