On Wed, Mar 31, 2021 at 09:18:58AM +1100, Dave Chinner wrote: > On Tue, Mar 30, 2021 at 10:25:24PM +0800, Gao Xiang wrote: > > From: Gao Xiang <hsiangkao@xxxxxxxxxx> > > > > Just use array and reallocate one-by-one here (not sure if bulk > > allocation is more effective or not.) > > Did you profile repairing a filesystem with lots of broken > directories? Optimisations like this really need to be profile > guided and the impact docuemnted. That way reviewers can actually > see the benefit the change brings to the table.... Nope, will do then (since I'm not confident with the target performance tbh.) > > > Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxx> > > --- > > repair/dir2.c | 34 +++++++++++++++++----------------- > > repair/dir2.h | 2 +- > > 2 files changed, 18 insertions(+), 18 deletions(-) > > > > diff --git a/repair/dir2.c b/repair/dir2.c > > index eabdb4f2d497..b6a8a5c40ae4 100644 > > --- a/repair/dir2.c > > +++ b/repair/dir2.c > > @@ -20,40 +20,40 @@ > > * Known bad inode list. These are seen when the leaf and node > > * block linkages are incorrect. > > */ > > -typedef struct dir2_bad { > > - xfs_ino_t ino; > > - struct dir2_bad *next; > > -} dir2_bad_t; > > +struct dir2_bad { > > + unsigned int nr; > > + xfs_ino_t *itab; > > +}; > > > > -static dir2_bad_t *dir2_bad_list; > > +static struct dir2_bad dir2_bad; > > > > static void > > dir2_add_badlist( > > xfs_ino_t ino) > > { > > - dir2_bad_t *l; > > + xfs_ino_t *itab; > > > > - if ((l = malloc(sizeof(dir2_bad_t))) == NULL) { > > + itab = realloc(dir2_bad.itab, (dir2_bad.nr + 1) * sizeof(xfs_ino_t)); > > + if (!itab) { > > do_error( > > _("malloc failed (%zu bytes) dir2_add_badlist:ino %" PRIu64 "\n"), > > - sizeof(dir2_bad_t), ino); > > + sizeof(xfs_ino_t), ino); > > exit(1); > > } > > - l->next = dir2_bad_list; > > - dir2_bad_list = l; > > - l->ino = ino; > > + itab[dir2_bad.nr++] = ino; > > + dir2_bad.itab = itab; > > } > > > > -int > > +bool > > dir2_is_badino( > > xfs_ino_t ino) > > { > > - dir2_bad_t *l; > > + unsigned int i; > > > > - for (l = dir2_bad_list; l; l = l->next) > > - if (l->ino == ino) > > - return 1; > > - return 0; > > + for (i = 0; i < dir2_bad.nr; ++i) > > + if (dir2_bad.itab[i] == ino) > > + return true; > > + return false; > > This ignores the problem with this code: it is a O(n * N) search > that gets done under an exclusive lock. Changing this to an array > doesn't improve the efficiency of the algorithm at all. It might > slighty reduce the magnitude of N, but modern CPU prefetchers detect > link list walks like this so are almost as fast sequentail array > walks. Hence this change will gain us relatively little when we have > millions of bad inodes to search. > > IOWs, the scalability problem that needs to be solved here is not > "replace a linked list", it is "replace an O(n * N) search > algorithm". We should address the algorithmic problem, not the code > implementation issue. > > That means we need to replace the linear search with a different > algorithm, not rework the data structure used to do a linear search. > We want this search to be reduced to O(n * log N) or O(n). We really > don't care about memory usage or even the overhead of per-object > memory allocation - we already do that and it isn't a performance > limitation, so optimising for memory allocation reductions is > optimising the wrong thing. > > Replacing the linked list with an AVL tree or radix tree will make > the search O(log N), giving us the desired reduction in search > overhead to O(n * log N) and, more importantly, a significant > reduction in lock hold times. Obviously, I agree with your idea. Radix tree indexed by inode number may waste space. I didn't notice that repair code had some AVL infrastructure, let me try to use AVL instead. But anyway, I'm not sure if it's a critical problem and if users really came into bad inode bomb. Use array here just because some previous review suggestion. Thanks, Gao Xiang > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx >