On Tue, Jan 12, 2021 at 02:48:35AM +0800, Gao Xiang wrote: > Hi Darrick, > > On Mon, Jan 11, 2021 at 10:17:53AM -0800, Darrick J. Wong wrote: > > On Mon, Jan 11, 2021 at 09:22:43PM +0800, Gao Xiang wrote: > > ... > > > > +int > > > +xfs_ag_shrink_space( > > > + struct xfs_mount *mp, > > > + struct xfs_trans *tp, > > > + struct aghdr_init_data *id, > > > + xfs_extlen_t len) > > > +{ > > > + struct xfs_buf *agibp, *agfbp; > > > + struct xfs_agi *agi; > > > + struct xfs_agf *agf; > > > + int error, err2; > > > + struct xfs_alloc_arg args = { > > > + .tp = tp, > > > + .mp = mp, > > > + .type = XFS_ALLOCTYPE_THIS_BNO, > > > + .minlen = len, > > > + .maxlen = len, > > > + .oinfo = XFS_RMAP_OINFO_SKIP_UPDATE, > > > + .resv = XFS_AG_RESV_NONE, > > > + .prod = 1 > > > + }; > > > > Dumb style note: I usually put onstack structs at the top of the > > variable declaration list so that the variables are sorted in order of > > decreasing size. > > Thanks for your detailed review! > > Ok, will move upwards in the next version. > > > > > > + > > > + ASSERT(id->agno == mp->m_sb.sb_agcount - 1); > > > + error = xfs_ialloc_read_agi(mp, tp, id->agno, &agibp); > > > + if (error) > > > + return error; > > > + > > > + agi = agibp->b_addr; > > > + > > > + error = xfs_alloc_read_agf(mp, tp, id->agno, 0, &agfbp); > > > + if (error) > > > + return error; > > > + > > > + args.fsbno = XFS_AGB_TO_FSB(mp, id->agno, > > > + be32_to_cpu(agi->agi_length) - len); > > > + > > > + /* remove the preallocations before allocation and re-establish then */ > > > + error = xfs_ag_resv_free(agibp->b_pag); > > > + if (error) > > > + return error; > > > + > > > + /* internal log shouldn't also show up in the free space btrees */ > > > + error = xfs_alloc_vextent(&args); > > > + if (error) > > > + goto out; > > > + > > > + if (args.agbno == NULLAGBLOCK) { > > > + error = -ENOSPC; > > > + goto out; > > > + } > > > > Hmm. So if the allocation above takes the last free space in the AG, we > > won't be able to satisfy the new per-AG allocation below, which could > > lead to a fs shutdown later (and even after subsequent mount attempts) > > until enough space gets freed. That's not good. > > > > I think we should update the length fields in agibp and agfbp, and then > > try to re-initialize the per-ag reservation. If that succeeds, we can > > log the agi and agf and return. If the reinitialization returns a > > non-ENOSPC error code then we of course just return the error code and > > let the fs shut down. > > > > However, if the reinit returns ENOSPC, then we have to back out > > everything we've changed so far. That means restore the old values of > > the agibp/agfbp lengths, log an EFI to free the space we just allocated, > > and return. The caller then has to be smart enough to commit the > > transaction before passing the ENOSPC back to its own caller. > > > > I think I get the point. I might need to reconfirm such case first to > verify new modification. (Although I think new reserve sizes could have > some way to be calculated in advance, yet after having seen that, I > might not want to touch such logic, so will try the way you mentioned... > Thanks!) > > > > + > > > + /* Change the agi length */ > > > + be32_add_cpu(&agi->agi_length, -len); > > > + xfs_ialloc_log_agi(tp, agibp, XFS_AGI_LENGTH); > > > + > > > + /* Change agf length */ > > > + agf = agfbp->b_addr; > > > + be32_add_cpu(&agf->agf_length, -len); > > > + ASSERT(agf->agf_length == agi->agi_length); > > > > Maybe we should check that these two lengths are the same right after we > > grab the AG[IF] buffers and return EFSCORRUPTED? > > Honestly, the original purpose of this wasn't for sanity check.. So I only > left an ASSERT here... > > If formal sanity check is needed, I will update this... Well, it's ondisk metadata, so it's perfectly valid to do a formal sanity check and bail out if things don't look right. :) --D > > > > > + xfs_alloc_log_agf(tp, agfbp, XFS_AGF_LENGTH); > > > + > > > +out: > > > + err2 = xfs_ag_resv_init(agibp->b_pag, tp); > > > + if (err2 && err2 != -ENOSPC) { > > > > I think we need to warn if err2 is ENOSPC here, since we're now running > > with a (slightly) compromised filesystem. > > Ok, will update. > > > > > > + xfs_warn(mp, > > > +"Error %d reserving per-AG metadata reserve pool.", err2); > > > + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); > > > + return err2; > > > + } > > > + return error; > > > +} > > > + > > > /* > > > * Extent the AG indicated by the @id by the length passed in > > > */ > > > diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h > > > index 5166322807e7..f3b5bbfeadce 100644 > > > --- a/fs/xfs/libxfs/xfs_ag.h > > > +++ b/fs/xfs/libxfs/xfs_ag.h > > > @@ -24,6 +24,8 @@ struct aghdr_init_data { > > > }; > > > > > > int xfs_ag_init_headers(struct xfs_mount *mp, struct aghdr_init_data *id); > > > +int xfs_ag_shrink_space(struct xfs_mount *mp, struct xfs_trans *tp, > > > + struct aghdr_init_data *id, xfs_extlen_t len); > > > int xfs_ag_extend_space(struct xfs_mount *mp, struct xfs_trans *tp, > > > struct aghdr_init_data *id, xfs_extlen_t len); > > > int xfs_ag_get_geometry(struct xfs_mount *mp, xfs_agnumber_t agno, > > > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c > > > index 8fde7a2989ce..6ee9ea4d5a67 100644 > > > --- a/fs/xfs/xfs_fsops.c > > > +++ b/fs/xfs/xfs_fsops.c > > > @@ -26,7 +26,7 @@ xfs_resizefs_init_new_ags( > > > struct aghdr_init_data *id, > > > xfs_agnumber_t oagcount, > > > xfs_agnumber_t nagcount, > > > - xfs_rfsblock_t *delta) > > > + int64_t *delta) > > > { > > > xfs_rfsblock_t nb = mp->m_sb.sb_dblocks + *delta; > > > int error; > > > @@ -76,22 +76,28 @@ xfs_growfs_data_private( > > > xfs_agnumber_t nagcount; > > > xfs_agnumber_t nagimax = 0; > > > xfs_rfsblock_t nb, nb_div, nb_mod; > > > - xfs_rfsblock_t delta; > > > + int64_t delta; > > > xfs_agnumber_t oagcount; > > > struct xfs_trans *tp; > > > + bool extend; > > > struct aghdr_init_data id = {}; > > > > > > nb = in->newblocks; > > > - if (nb < mp->m_sb.sb_dblocks) > > > - return -EINVAL; > > > - if ((error = xfs_sb_validate_fsb_count(&mp->m_sb, nb))) > > > + if (nb == mp->m_sb.sb_dblocks) > > > + return 0; > > > + > > > + error = xfs_sb_validate_fsb_count(&mp->m_sb, nb); > > > + if (error) > > > return error; > > > - error = xfs_buf_read_uncached(mp->m_ddev_targp, > > > + > > > + if (nb > mp->m_sb.sb_dblocks) { > > > + error = xfs_buf_read_uncached(mp->m_ddev_targp, > > > XFS_FSB_TO_BB(mp, nb) - XFS_FSS_TO_BB(mp, 1), > > > XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL); > > > - if (error) > > > - return error; > > > - xfs_buf_relse(bp); > > > + if (error) > > > + return error; > > > + xfs_buf_relse(bp); > > > + } > > > > > > nb_div = nb; > > > nb_mod = do_div(nb_div, mp->m_sb.sb_agblocks); > > > @@ -99,10 +105,12 @@ xfs_growfs_data_private( > > > if (nb_mod && nb_mod < XFS_MIN_AG_BLOCKS) { > > > nagcount--; > > > nb = (xfs_rfsblock_t)nagcount * mp->m_sb.sb_agblocks; > > > - if (nb < mp->m_sb.sb_dblocks) > > > + if (nagcount < 2) > > > return -EINVAL; > > > } > > > + > > > delta = nb - mp->m_sb.sb_dblocks; > > > + extend = (delta > 0); > > > oagcount = mp->m_sb.sb_agcount; > > > > > > /* allocate the new per-ag structures */ > > > @@ -110,22 +118,34 @@ xfs_growfs_data_private( > > > error = xfs_initialize_perag(mp, nagcount, &nagimax); > > > if (error) > > > return error; > > > + } else if (nagcount != oagcount) { > > > + /* TODO: shrinking the entire AGs hasn't yet completed */ > > > + return -EINVAL; > > > } > > > > > > error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata, > > > - XFS_GROWFS_SPACE_RES(mp), 0, XFS_TRANS_RESERVE, &tp); > > > + (extend ? XFS_GROWFS_SPACE_RES(mp) : -delta), 0, > > > + XFS_TRANS_RESERVE, &tp); > > > if (error) > > > return error; > > > > > > - error = xfs_resizefs_init_new_ags(mp, &id, oagcount, nagcount, &delta); > > > - if (error) > > > - goto out_trans_cancel; > > > - > > > + if (extend) { > > > + error = xfs_resizefs_init_new_ags(mp, &id, oagcount, > > > + nagcount, &delta); > > > + if (error) > > > + goto out_trans_cancel; > > > + } > > > xfs_trans_agblocks_delta(tp, id.nfree); > > > > > > - /* If there are new blocks in the old last AG, extend it. */ > > > + /* If there are some blocks in the last AG, resize it. */ > > > if (delta) { > > > - error = xfs_ag_extend_space(mp, tp, &id, delta); > > > + if (extend) { > > > + error = xfs_ag_extend_space(mp, tp, &id, delta); > > > + } else { > > > + id.agno = nagcount - 1; > > > + error = xfs_ag_shrink_space(mp, tp, &id, -delta); > > > + } > > > + > > > if (error) > > > goto out_trans_cancel; > > > } > > > @@ -137,11 +157,19 @@ xfs_growfs_data_private( > > > */ > > > if (nagcount > oagcount) > > > xfs_trans_mod_sb(tp, XFS_TRANS_SB_AGCOUNT, nagcount - oagcount); > > > - if (nb > mp->m_sb.sb_dblocks) > > > + if (nb != mp->m_sb.sb_dblocks) > > > xfs_trans_mod_sb(tp, XFS_TRANS_SB_DBLOCKS, > > > nb - mp->m_sb.sb_dblocks); > > > if (id.nfree) > > > xfs_trans_mod_sb(tp, XFS_TRANS_SB_FDBLOCKS, id.nfree); > > > + > > > + /* > > > + * update in-core counters (especially sb_fdblocks) now > > > + * so xfs_validate_sb_write() can pass. > > > + */ > > > + if (xfs_sb_version_haslazysbcount(&mp->m_sb)) > > > + xfs_log_sb(tp); > > > + > > > xfs_trans_set_sync(tp); > > > error = xfs_trans_commit(tp); > > > if (error) > > > @@ -157,7 +185,7 @@ xfs_growfs_data_private( > > > * If we expanded the last AG, free the per-AG reservation > > > * so we can reinitialize it with the new size. > > > */ > > > - if (delta) { > > > + if (delta > 0) { > > > struct xfs_perag *pag; > > > > > > pag = xfs_perag_get(mp, id.agno); > > > @@ -178,6 +206,10 @@ xfs_growfs_data_private( > > > return error; > > > > > > out_trans_cancel: > > > + if (!extend && (tp->t_flags & XFS_TRANS_DIRTY)) { > > > > If you're going to have a conditional commit in what looks like the > > error return cleanup path, it would be a good idea to leave a comment > > here explaining when we can end up in this condition. > > Ok, Will add some words as well. > > Thanks, > Gao Xiang > > > > > --D > > > > > + xfs_trans_commit(tp); > > > + return error; > > > + } > > > xfs_trans_cancel(tp); > > > return error; > > > } > > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c > > > index e72730f85af1..fd2cbf414b80 100644 > > > --- a/fs/xfs/xfs_trans.c > > > +++ b/fs/xfs/xfs_trans.c > > > @@ -419,7 +419,6 @@ xfs_trans_mod_sb( > > > tp->t_res_frextents_delta += delta; > > > break; > > > case XFS_TRANS_SB_DBLOCKS: > > > - ASSERT(delta > 0); > > > tp->t_dblocks_delta += delta; > > > break; > > > case XFS_TRANS_SB_AGCOUNT: > > > -- > > > 2.27.0 > > > > > >