On Wed, May 16, 2018 at 10:34:36AM -0700, Allison Henderson wrote: > Ok, with the points Dave made: > Reviewed by: Allison Henderson <allison.henderson@xxxxxxxxxx> > > On 05/16/2018 12:56 AM, Dave Chinner wrote: > > On Tue, May 15, 2018 at 03:33:58PM -0700, Darrick J. Wong wrote: > > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > > > > Add some helpers to assemble a list of fs block extents. Generally, > > > repair functions will iterate the rmapbt to make a list (1) of all > > > extents owned by the nominal owner of the metadata structure; then they > > > will iterate all other structures with the same rmap owner to make a > > > list (2) of active blocks; and finally we have a subtraction function to > > > subtract all the blocks in (2) from (1), with the result that (1) is now > > > a list of blocks that were owned by the old btree and must be disposed. > > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > --- > > > fs/xfs/scrub/repair.c | 207 +++++++++++++++++++++++++++++++++++++++++++++++++ > > > fs/xfs/scrub/repair.h | 31 +++++++ > > > 2 files changed, 238 insertions(+) > > > > > > > > > diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c > > > index 72f04a717150..8e8ecddd7537 100644 > > > --- a/fs/xfs/scrub/repair.c > > > +++ b/fs/xfs/scrub/repair.c > > > @@ -361,3 +361,210 @@ xfs_repair_init_btblock( > > > return 0; > > > } > > > + > > > +/* Collect a dead btree extent for later disposal. */ > > > +int > > > +xfs_repair_collect_btree_extent( > > > + struct xfs_scrub_context *sc, > > > + struct xfs_repair_extent_list *exlist, > > > + xfs_fsblock_t fsbno, > > > + xfs_extlen_t len) > > > +{ > > > + struct xfs_repair_extent *rex; > > > + > > > + trace_xfs_repair_collect_btree_extent(sc->mp, > > > + XFS_FSB_TO_AGNO(sc->mp, fsbno), > > > + XFS_FSB_TO_AGBNO(sc->mp, fsbno), len); > > > + > > > + rex = kmem_alloc(sizeof(struct xfs_repair_extent), KM_MAYFAIL); > > > + if (!rex) > > > + return -ENOMEM; > > Is this in transaction context? Regardless, I think we need to run > > the entire of scrub/repair in a memalloc_nofs_save() context so we > > don't have memory reclaim recursion issues... > > > > [...] > > > > > +/* Compare two btree extents. */ > > > +static int > > > +xfs_repair_btree_extent_cmp( > > > + void *priv, > > > + struct list_head *a, > > > + struct list_head *b) > > > +{ > > > + struct xfs_repair_extent *ap; > > > + struct xfs_repair_extent *bp; > > > + > > > + ap = container_of(a, struct xfs_repair_extent, list); > > > + bp = container_of(b, struct xfs_repair_extent, list); > > > + > > > + if (ap->fsbno > bp->fsbno) > > > + return 1; > > > + else if (ap->fsbno < bp->fsbno) > > > + return -1; > > No need for the else there. > Well, I think he meant to return 0 in the case of ap->fsbno == bp->fsbno? > Am i reading that right? caller expects 1 for greater than, -1 for less > than and 0 on equivalence? Correct. I think Dave was pointing out that else-after-return is unnecessary, since the following behaves equivalently: if (a > b) return 1; if (a < b) return -1; return 0; Note that gcc (7.3, anyway) generates the same asm for either version so I assume this is mostly stylistic cleanup to make the comparisons line up? I don't have a preference either way. :) --D > > > + return 0; > > > +} > > > + > > > +/* > > > + * Remove all the blocks mentioned in sublist from the extents in exlist. > > > + * > > > + * The intent is that callers will iterate the rmapbt for all of its records > > > + * for a given owner to generate exlist; and iterate all the blocks of the > > generate @exlist > > > > > + * metadata structures that are not being rebuilt and have the same rmapbt > > > + * owner to generate sublist. This routine subtracts all the extents > > generate @sublist. > > > > > + * mentioned in sublist from all the extents linked in exlist, which leaves > > > + * exlist as the list of blocks that are not accounted for, which we assume > > > + * are the dead blocks of the old metadata structure. The blocks mentioned in > > > + * exlist can be reaped. > > > + */ > > > +#define XFS_REPAIR_EXT_LEFT_CONTIG (1 << 0) > > > +#define XFS_REPAIR_EXT_RIGHT_CONTIG (1 << 1) > > > +int > > > +xfs_repair_subtract_extents( > > > + struct xfs_scrub_context *sc, > > > + struct xfs_repair_extent_list *exlist, > > > + struct xfs_repair_extent_list *sublist) > > > +{ > > > + struct list_head *lp; > > > + struct xfs_repair_extent *ex; > > > + struct xfs_repair_extent *newex; > > > + struct xfs_repair_extent *subex; > > > + xfs_fsblock_t sub_fsb; > > > + xfs_extlen_t sub_len; > > > + int state; > > > + int error = 0; > > > + > > > + if (list_empty(&exlist->list) || list_empty(&sublist->list)) > > > + return 0; > > > + ASSERT(!list_empty(&sublist->list)); > > > + > > > + list_sort(NULL, &exlist->list, xfs_repair_btree_extent_cmp); > > > + list_sort(NULL, &sublist->list, xfs_repair_btree_extent_cmp); > > > + > > > + subex = list_first_entry(&sublist->list, struct xfs_repair_extent, > > > + list); > > > + lp = exlist->list.next; > > > + while (lp != &exlist->list) { > > > + ex = list_entry(lp, struct xfs_repair_extent, list); > > > + > > > + /* > > > + * Advance subex and/or ex until we find a pair that > > > + * intersect or we run out of extents. > > > + */ > > > + while (subex->fsbno + subex->len <= ex->fsbno) { > > > + if (list_is_last(&subex->list, &sublist->list)) > > > + goto out; > > > + subex = list_next_entry(subex, list); > > > + } > > So this is a O(n^2) algorithm, right? How does it scale with large > > extent lists? Given that these extents are dynamically allocated, > > and we're already burning 16 bytes for a list head on each extent, > > would it be better to use a smarter structure better suited for > > exact lookups? e.g. an rbtree only takes an extra 8 bytes per > > extent, and we get O(log N) searches on the inner loop here... > > > > I guess this isn't necessary to fix right now, but I think it's > > going to be an issue for maybe mark this down as "needing to be > > fixed before removing EXPERIMENTAL tags"? > > > > > + if (subex->fsbno >= ex->fsbno + ex->len) { > > > + lp = lp->next; > > > + continue; > > > + } > > > + > > > + /* trim subex to fit the extent we have */ > > > + sub_fsb = subex->fsbno; > > > + sub_len = subex->len; > > > + if (subex->fsbno < ex->fsbno) { > > > + sub_len -= ex->fsbno - subex->fsbno; > > > + sub_fsb = ex->fsbno; > > > + } > > > + if (sub_len > ex->len) > > > + sub_len = ex->len; > > > + > > > + state = 0; > > > + if (sub_fsb == ex->fsbno) > > > + state |= XFS_REPAIR_EXT_LEFT_CONTIG; > > > + if (sub_fsb + sub_len == ex->fsbno + ex->len) > > > + state |= XFS_REPAIR_EXT_RIGHT_CONTIG; > > Ok, I think "CONTIG" is not the right word to use here. In the BMAP > > btrees, the merge state flags were to tell us whether the edge of > > the new extent is contiguous with the left and right extents in > > the tree, not whether the new extents overlapped to the left/right > > edges. > > > > i.e. we're checking whether extent start/end overlaps are aligned > > here, not whether they are contiguous with some other extent. So in > > this case, I'd just name the variables "LEFT_ALIGNED" and > > "RIGHT_ALIGNED" and drop all the namespace bits from them. > > > > > + switch (state) { > > > + case XFS_REPAIR_EXT_LEFT_CONTIG: > > > + /* Coincides with only the left. */ > > And by calling them aligned, the comments become redundant: > > > > case LEFT_ALIGNED: > > > + ex->fsbno += sub_len; > > > + ex->len -= sub_len; > > > + break; > > > + case XFS_REPAIR_EXT_RIGHT_CONTIG: > > > + /* Coincides with only the right. */ > > > + ex->len -= sub_len; > > > + lp = lp->next; > > > + break; > > > + case XFS_REPAIR_EXT_LEFT_CONTIG | XFS_REPAIR_EXT_RIGHT_CONTIG: > > > + /* Total overlap, just delete ex. */ > > > + lp = lp->next; > > > + list_del(&ex->list); > > > + kmem_free(ex); > > > + break; > > > + case 0: > > > + /* > > > + * Deleting from the middle: add the new right extent > > > + * and then shrink the left extent. > > > + */ > > > + newex = kmem_alloc(sizeof(struct xfs_repair_extent), > > > + KM_MAYFAIL); > > > + if (!newex) { > > > + error = -ENOMEM; > > > + goto out; > > > + } > > > + INIT_LIST_HEAD(&newex->list); > > > + newex->fsbno = sub_fsb + sub_len; > > > + newex->len = ex->len - (sub_fsb - ex->fsbno) - sub_len; > > so: new len = old len - (length of first extent) - length of overlap. > > > > I think this is more obvious as "new len = old end - new start", i.e.: > > > > newex->len = ex->fsbno + ex->len - newex->fsbno; > > > > Cheers, > > > > Dave. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html