On Wed, May 16, 2018 at 11:37:29AM -0700, Darrick J. Wong wrote:
> On Wed, May 16, 2018 at 06:51:52PM +1000, Dave Chinner wrote:
> > On Tue, May 15, 2018 at 03:34:10PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > 
> > > Add a helper function to help us recover btree roots from the rmap data.
> > > Callers pass in a list of rmap owner codes, buffer ops, and magic
> > > numbers.  We iterate the rmap records looking for owner matches, and
> > > then read the matching blocks to see if the magic number & uuid match.
> > > If so, we then read-verify the block, and if that passes then we retain
> > > a pointer to the block with the highest level, assuming that by the end
> > > of the call we will have found the root.  This will be used to reset the
> > > AGF/AGI btree root fields during their rebuild procedures.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
.....
> > > +	/* Ignore this block if it's lower in the tree than we've seen. */
> > > +	if (fab->root != NULLAGBLOCK &&
> > > +	    xfs_btree_get_level(btblock) < fab->height)
> > > +		goto out;
> > > +
> > > +	/* Make sure we pass the verifiers. */
> > > +	bp->b_ops->verify_read(bp);
> > > +	if (bp->b_error)
> > > +		goto out;
> > > +	fab->root = agbno;
> > > +	fab->height = xfs_btree_get_level(btblock) + 1;
> > > +	*found_it = true;
> > > +
> > > +	trace_xfs_repair_findroot_block(mp, ri->sc->sa.agno, agbno,
> > > +			be32_to_cpu(btblock->bb_magic), fab->height - 1);
> > > +out:
> > > +	xfs_trans_brelse(ri->sc->tp, bp);
> > 
> > So we release the buffer once we've found it, which also unlocks it.
> > That means when we come back to it later, it may have been accessed
> > and changed by something else and no longer be the block we are
> > looking for.  How do you protect against this sort of race given we
> > are unlocking the buffer?  Perhaps it should be held on the fab
> > structure, and released when a better candidate is found?
> 
> The two callers of this function are the AGF and AGI repair functions.
> AGF repair holds the locked AGF buffer, and AGI repair holds the locked
> AGF & AGI buffers, which should be enough to prevent anyone else from
> accessing the AG btrees.  They keep all the AG header buffers locked
> until they're completely finished with rebuilding the headers (i.e.
> xfs_scrub_teardown) and it's safe for the shape to change.
> 
> How about I add to the comment for this function:
> 
> /*
>  * The caller must lock the applicable per-AG header buffers (AGF, AGI)
>  * to prevent other threads from changing the shape of the btrees that
>  * we are looking for.  It must maintain those locks until it's safe for
>  * other threads to change the btrees' shapes.
>  */

That's helpful. :)

Can you sprinkle some checks like ASSERT(xfs_buf_islocked(agbp)) to
remind readers of the leaf/callback functions that they expect the
AGF/AGI to be locked on entry?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
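
For illustration only, here is a minimal userspace sketch of the
assertion pattern suggested above.  It is not the kernel code: struct
mock_buf, mock_buf_islocked(), struct mock_findroot and
mock_findroot_block() are hypothetical stand-ins for struct xfs_buf,
xfs_buf_islocked() and the findroot callback, and plain assert() stands
in for the kernel's ASSERT().  The level comparison mirrors the quoted
hunk; the read verifier step is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer; it only tracks lock state. */
struct mock_buf {
	bool locked;
};

/* Hypothetical analogue of xfs_buf_islocked(). */
static bool mock_buf_islocked(const struct mock_buf *bp)
{
	return bp->locked;
}

/* Assumed sentinel meaning "no root candidate found yet". */
#define MOCK_NULLAGBLOCK	((unsigned int)-1)

/* Hypothetical search state: best root candidate seen so far. */
struct mock_findroot {
	struct mock_buf	*agf_bp;	/* AG header buffer held by the caller */
	unsigned int	root;		/* block number of the best candidate */
	unsigned int	height;		/* level of that candidate, plus one */
};

/*
 * Callback invoked for each candidate block.  The caller must hold the
 * AG header buffer locked so the btree shape cannot change underneath
 * us; the assert documents that expectation at the point of use.
 */
static bool mock_findroot_block(struct mock_findroot *fr,
				unsigned int agbno, unsigned int level)
{
	assert(mock_buf_islocked(fr->agf_bp));

	/* Ignore this block if it's lower in the tree than we've seen. */
	if (fr->root != MOCK_NULLAGBLOCK && level < fr->height)
		return false;

	fr->root = agbno;
	fr->height = level + 1;
	return true;
}
```

A caller would set agf_bp to the locked header buffer, initialise root
to MOCK_NULLAGBLOCK and height to 0, and then invoke the callback once
per candidate block.  If the header buffer were not locked, the assert
would fire on the first call instead of letting the search run against a
tree that can change shape.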