On Fri, Mar 05, 2021 at 10:57:01AM +0800, Gao Xiang wrote:
> This patch introduces a helper to shrink unused space in the last AG
> by fixing up the freespace btree.
>
> Also make sure that the per-AG reservation works under the new AG
> size. If such per-AG reservation or extent allocation fails, roll
> the transaction so the new transaction could cancel without any side
> effects.
>
> Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxx>
> ---

Looks mostly good to me. Some nits..

>  fs/xfs/libxfs/xfs_ag.c | 111 +++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/libxfs/xfs_ag.h |   4 +-
>  2 files changed, 114 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> index 9331f3516afa..1f6f9e70e1cb 100644
> --- a/fs/xfs/libxfs/xfs_ag.c
> +++ b/fs/xfs/libxfs/xfs_ag.c
...
> @@ -485,6 +490,112 @@ xfs_ag_init_headers(
>  	return error;
>  }
>
> +int
> +xfs_ag_shrink_space(
> +	struct xfs_mount	*mp,
> +	struct xfs_trans	**tpp,
> +	xfs_agnumber_t		agno,
> +	xfs_extlen_t		delta)
> +{
> +	struct xfs_alloc_arg	args = {
> +		.tp	= *tpp,
> +		.mp	= mp,
> +		.type	= XFS_ALLOCTYPE_THIS_BNO,
> +		.minlen = delta,
> +		.maxlen = delta,
> +		.oinfo	= XFS_RMAP_OINFO_SKIP_UPDATE,
> +		.resv	= XFS_AG_RESV_NONE,
> +		.prod	= 1
> +	};
> +	struct xfs_buf		*agibp, *agfbp;
> +	struct xfs_agi		*agi;
> +	struct xfs_agf		*agf;
> +	int			error, err2;
> +
> +	ASSERT(agno == mp->m_sb.sb_agcount - 1);
> +	error = xfs_ialloc_read_agi(mp, *tpp, agno, &agibp);
> +	if (error)
> +		return error;
> +
> +	agi = agibp->b_addr;
> +
> +	error = xfs_alloc_read_agf(mp, *tpp, agno, 0, &agfbp);
> +	if (error)
> +		return error;
> +
> +	agf = agfbp->b_addr;
> +	if (XFS_IS_CORRUPT(mp, agf->agf_length != agi->agi_length))
> +		return -EFSCORRUPTED;

Is this check here for a reason? It seems a bit random, so I wonder if we
should just leave the extra verification to buffer verifiers.

> +
> +	if (delta >= agi->agi_length)
> +		return -EINVAL;
> +
> +	args.fsbno = XFS_AGB_TO_FSB(mp, agno,
> +				    be32_to_cpu(agi->agi_length) - delta);
> +
> +	/* remove the preallocations before allocation and re-establish then */

The comment is a little confusing. Perhaps something like the following,
if accurate..?

	/*
	 * Disable perag reservations so it doesn't cause the allocation request
	 * to fail. We'll reestablish reservation before we return.
	 */

> +	error = xfs_ag_resv_free(agibp->b_pag);
> +	if (error)
> +		return error;
> +
> +	/* internal log shouldn't also show up in the free space btrees */
> +	error = xfs_alloc_vextent(&args);
> +	if (!error && args.agbno == NULLAGBLOCK)
> +		error = -ENOSPC;
> +
> +	if (error) {
> +		/*
> +		 * if extent allocation fails, need to roll the transaction to
> +		 * ensure that the AGFL fixup has been committed anyway.
> +		 */
> +		err2 = xfs_trans_roll(tpp);
> +		if (err2)
> +			return err2;
> +		goto resv_init_out;

So if this fails and the transaction rolls, do we still hold the agi/agf
buffers here? If not, there might be a window of time where it's possible
for some other task to come in and alloc out of the AG without the perag
res being active.

> +	}
> +
> +	/*
> +	 * if successfully deleted from freespace btrees, need to confirm
> +	 * per-AG reservation works as expected.
> +	 */
> +	be32_add_cpu(&agi->agi_length, -delta);
> +	be32_add_cpu(&agf->agf_length, -delta);
> +
> +	err2 = xfs_ag_resv_init(agibp->b_pag, *tpp);
> +	if (err2) {
> +		be32_add_cpu(&agi->agi_length, delta);
> +		be32_add_cpu(&agf->agf_length, delta);
> +		if (err2 != -ENOSPC)
> +			goto resv_err;
> +
> +		__xfs_bmap_add_free(*tpp, args.fsbno, delta, NULL, true);
> +
> +		/*
> +		 * Roll the transaction before trying to re-init the per-ag
> +		 * reservation. The new transaction is clean so it will cancel
> +		 * without any side effects.
> +		 */
> +		error = xfs_defer_finish(tpp);
> +		if (error)
> +			return error;
> +
> +		error = -ENOSPC;
> +		goto resv_init_out;
> +	}
> +	xfs_ialloc_log_agi(*tpp, agibp, XFS_AGI_LENGTH);
> +	xfs_alloc_log_agf(*tpp, agfbp, XFS_AGF_LENGTH);
> +	return 0;
> +
> +resv_init_out:
> +	err2 = xfs_ag_resv_init(agibp->b_pag, *tpp);
> +	if (!err2)
> +		return error;
> +resv_err:
> +	xfs_warn(mp, "Error %d reserving per-AG metadata reserve pool.", err2);
> +	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> +	return err2;
> +}
> +
>  /*
>   * Extent the AG indicated by the @id by the length passed in
>   */
> diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h
> index 5166322807e7..41293ebde8da 100644
> --- a/fs/xfs/libxfs/xfs_ag.h
> +++ b/fs/xfs/libxfs/xfs_ag.h
> @@ -24,8 +24,10 @@ struct aghdr_init_data {
>  };
>
>  int xfs_ag_init_headers(struct xfs_mount *mp, struct aghdr_init_data *id);
> +int xfs_ag_shrink_space(struct xfs_mount *mp, struct xfs_trans **tpp,
> +			xfs_agnumber_t agno, xfs_extlen_t len);
>  int xfs_ag_extend_space(struct xfs_mount *mp, struct xfs_trans *tp,
> -			struct aghdr_init_data *id, xfs_extlen_t len);
> +			struct aghdr_init_data *id, xfs_extlen_t delta);

This looks misplaced..? Or maybe this is trying to make the APIs
consistent, but the function definition still uses len as well as the
declaration for _ag_shrink_space() (while the definition of that function
uses delta).

FWIW, the name delta tends to suggest a signed value to me based on our
pattern of usage, whereas here it seems like these helpers always want a
positive value (i.e. a length).

Brian

>  int xfs_ag_get_geometry(struct xfs_mount *mp, xfs_agnumber_t agno,
>  			struct xfs_ag_geometry *ageo);
>
> --
> 2.27.0
>
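
On the delta vs. len naming point above, a minimal userspace sketch may help
show what a pair of helpers looks like when both take an unsigned length. This
is not kernel code: struct fake_ag, fake_ag_shrink() and fake_ag_extend() are
hypothetical names made up for illustration, and the length is kept in CPU
byte order to keep the example short. The only thing borrowed from the patch
is the rule that a shrink of the whole AG or more is rejected with -EINVAL.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the on-disk AG length; not a kernel structure. */
struct fake_ag {
	uint32_t	length;		/* AG size in blocks, CPU byte order */
};

/*
 * Shrink @ag by @len blocks.  @len is an unsigned block count, so the
 * direction of the change comes from the function name, not from a sign.
 * Mirrors the patch in rejecting a request that would consume the whole AG.
 */
static int fake_ag_shrink(struct fake_ag *ag, uint32_t len)
{
	if (len >= ag->length)
		return -EINVAL;
	ag->length -= len;
	return 0;
}

/*
 * Extend @ag by @len blocks.  Same parameter name and type as the shrink
 * helper; the overflow guard is an addition of this sketch.
 */
static int fake_ag_extend(struct fake_ag *ag, uint32_t len)
{
	if (len > UINT32_MAX - ag->length)
		return -EINVAL;
	ag->length += len;
	return 0;
}
```

With both helpers taking len, a caller never has to reason about a negative
argument: shrinking a 100-block AG by 30 and then extending it by 30 returns
it to 100 blocks, and asking to shrink by the full length fails with -EINVAL.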