On Mon, Jun 29, 2020 at 08:22:28AM -0400, Brian Foster wrote:
> On Thu, Jun 25, 2020 at 01:52:39PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > 
> > In commit 9851fd79bfb1, we added a slight amount of slack to the free
> > space btrees being reconstructed so that the initial fix_freelist call
> > (which is run against a totally empty AGFL) would never have to split
> > either free space btree in order to populate the free list.
> > 
> > The new btree bulk loading code in xfs_repair can re-create this
> > situation because it can set the slack values to zero if the filesystem
> > is very full.  However, these days repair has the infrastructure needed
> > to ensure that overestimations of the btree block counts end up on the
> > AGFL or get freed back into the filesystem at the end of phase 5.
> > 
> > Fix this problem by reserving blocks to a separate AGFL block
> > reservation, and checking that between this new reservation and any
> > overages in the bnobt/cntbt fakeroots, we have enough blocks sitting
> > around to populate the AGFL with the minimum number of blocks it needs
> > to handle a split in the bno/cnt/rmap btrees.
> > 
> > Note that we reserve blocks for the new bnobt/cntbt/AGFL at the very end
> > of the reservation steps in phase 5, so the extra allocation should not
> > cause repair to fail if it can't find blocks for btrees.
> > 
> > Fixes: 9851fd79bfb1 ("repair: AGFL rebuild fails if btree split required")
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > ---
> >  repair/agbtree.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 68 insertions(+), 10 deletions(-)
> > 
> > 
> > diff --git a/repair/agbtree.c b/repair/agbtree.c
> > index 339b1489..7a4f316c 100644
> > --- a/repair/agbtree.c
> > +++ b/repair/agbtree.c
> ...
> > @@ -262,25 +286,59 @@ _("Unable to compute free space by block btree geometry, error %d.\n"), -error);
> ...
> > +
> > +	/*
> > +	 * Now try to fill the bnobt/cntbt cursors with extra blocks to
> > +	 * populate the AGFL.  If we don't get all the blocks we want,
> > +	 * stop trying to fill the AGFL.
> > +	 */
> > +	wanted = (int64_t)btr_bno->bload.nr_blocks +
> > +			(min_agfl_len / 2) - bno_blocks;
> > +	if (wanted > 0 && fill_agfl) {
> > +		got = reserve_agblocks(sc->mp, agno, btr_bno, wanted);
> > +		if (wanted > got)
> > +			fill_agfl = false;
> > +		btr_bno->bload.nr_blocks += got;
> > +	}
> > +
> > +	wanted = (int64_t)btr_cnt->bload.nr_blocks +
> > +			(min_agfl_len / 2) - cnt_blocks;
> > +	if (wanted > 0 && fill_agfl) {
> > +		got = reserve_agblocks(sc->mp, agno, btr_cnt, wanted);
> > +		if (wanted > got)
> > +			fill_agfl = false;
> > +		btr_cnt->bload.nr_blocks += got;
> > +	}
> 
> It's a little hard to follow this with the nr_blocks sampling and
> whatnot, but I think I get the idea.  What's the reason for splitting
> the AGFL res requirement evenly across the two cursors?  These AGFL
> blocks all fall into the same overflow pool, right?  I was wondering
> why we couldn't just attach the overflow to one, or check one for the
> full res and then the other if more blocks are needed.

I chose to stuff the excess blocks into the bnobt and cntbt bulkload
cursors to avoid having to initialize a semi-phony "bulkload cursor"
for the agfl, and I decided to split them evenly between the two
cursors so that I wouldn't someday have to deal with a bug report about
how one cursor somehow ran out of blocks while the other one had plenty
more.
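To put some (totally made up) numbers on the "same overflow pool" part,
here's a toy model of the accounting -- standalone C, not the repair
code itself, and every quantity in it is invented purely for
illustration:

/* Toy model of the even split, with made-up numbers; not repair code. */
#include <stdio.h>

int main(void)
{
	unsigned int min_agfl_len = 8;			/* pretend AGFL minimum */
	unsigned int bnobt_est = 9, cntbt_est = 11;	/* pretend geometry estimates */

	/* Each cursor is topped up to its estimate plus half the AGFL minimum. */
	unsigned int bno_res = bnobt_est + min_agfl_len / 2;
	unsigned int cnt_res = cntbt_est + min_agfl_len / 2;

	/* Say the bnobt ends up one block smaller than estimated... */
	unsigned int bnobt_used = bnobt_est - 1;
	unsigned int cntbt_used = cntbt_est;

	/* ...then both cursors' leftovers land in the same AGFL pool. */
	unsigned int agfl_pool = (bno_res - bnobt_used) + (cnt_res - cntbt_used);

	printf("AGFL pool: %u blocks (minimum wanted: %u)\n",
			agfl_pool, min_agfl_len);
	return 0;
}

The point being that it doesn't matter which cursor the slack came
from; anything neither btree consumes lands in the same pool that
populates the AGFL at the end of phase 5.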
> In thinking about it a bit more, wouldn't the whole algorithm be more
> simple if we reserved the min AGFL requirement first, optionally passed
> 'agfl_res' to reserve_btblocks() such that subsequent reservations can
> steal from it (and then fail if it depletes), then stuff what's left in
> one (or both, if there's a reason for that) of the cursors at the end?

Hmm.  I hadn't thought about that.  In general I wanted the AGFL
reservations to go last because I'd rather we set off with an underfull
AGFL than totally fail because we couldn't get blocks for the
bnobt/cntbt, but I suppose you're right that we could steal from the
AGFL reservation as needed to prevent repair failure.

So, uh, I could rework this patch to create a phony agfl bulk load
cursor, fill it before the loop, steal blocks from it to fill the
bnobt/cntbt to satisfy failed allocations, and then dump any remainders
into the bnobt/cntbt cursors afterwards.  (Rough sketch of that
ordering at the bottom of this mail.)

How does that sound?

--D

> Brian
> 
> > 
> >  		/* Ok, now how many free space records do we have? */
> >  		*nr_extents = count_bno_extents_blocks(agno, &num_freeblocks);
> >  	} while (1);
> > -
> > -	*extra_blocks = (bno_blocks - btr_bno->bload.nr_blocks) +
> > -			(cnt_blocks - btr_cnt->bload.nr_blocks);
> >  }
> > 
> >  /* Rebuild the free space btrees. */
> > 
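For concreteness, here's a toy model of the ordering I'm describing
above -- again standalone C with made-up numbers; take(), reserve(),
and agfl_res here are purely illustrative stand-ins, not helpers that
repair would actually grow:

/*
 * Toy model of the reworked ordering, not repair code: reserve the
 * minimum AGFL population first, let the btree reservations steal from
 * it if the AG runs dry, then dump the remainder into the two cursors.
 */
#include <stdio.h>

static unsigned int ag_free = 20;	/* pretend free blocks left in this AG */

/* Take up to @want blocks from the AG; return how many we actually got. */
static unsigned int take(unsigned int want)
{
	unsigned int got = want < ag_free ? want : ag_free;

	ag_free -= got;
	return got;
}

/* Reserve @need blocks for a btree, stealing from @agfl_res if we're short. */
static unsigned int reserve(unsigned int need, unsigned int *agfl_res)
{
	unsigned int got = take(need);

	if (got < need) {
		unsigned int steal = need - got;

		if (steal > *agfl_res)
			steal = *agfl_res;	/* real code would bail out here */
		*agfl_res -= steal;
		got += steal;
	}
	return got;
}

int main(void)
{
	unsigned int min_agfl_len = 8;			/* pretend AGFL minimum */
	unsigned int bnobt_need = 9, cntbt_need = 11;	/* pretend geometry estimates */

	/* Fill the phony AGFL reservation before the btree loop. */
	unsigned int agfl_res = take(min_agfl_len);

	/* Btree reservations come next and may steal from agfl_res. */
	unsigned int bno_res = reserve(bnobt_need, &agfl_res);
	unsigned int cnt_res = reserve(cntbt_need, &agfl_res);

	/* Dump whatever is left of the AGFL reservation into the cursors. */
	bno_res += agfl_res / 2;
	cnt_res += agfl_res - agfl_res / 2;

	printf("bnobt: %u blocks, cntbt: %u blocks, AG free: %u\n",
			bno_res, cnt_res, ag_free);
	return 0;
}

In other words, the AGFL reservation happens first but is the first
thing we sacrifice if the btrees come up short, which I think matches
what you're suggesting.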