Re: [PATCH v3 1/8] repair: turn bad inode list into array

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 31, 2021 at 09:18:58AM +1100, Dave Chinner wrote:
> On Tue, Mar 30, 2021 at 10:25:24PM +0800, Gao Xiang wrote:
> > From: Gao Xiang <hsiangkao@xxxxxxxxxx>
> > 
> > Just use array and reallocate one-by-one here (not sure if bulk
> > allocation is more effective or not.)
> 
> Did you profile repairing a filesystem with lots of broken
> directories? Optimisations like this really need to be profile
> guided and the impact docuemnted. That way reviewers can actually
> see the benefit the change brings to the table....

Nope, will do then (since I'm not confident with the target
performance tbh.)

> 
> > Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxx>
> > ---
> >  repair/dir2.c | 34 +++++++++++++++++-----------------
> >  repair/dir2.h |  2 +-
> >  2 files changed, 18 insertions(+), 18 deletions(-)
> > 
> > diff --git a/repair/dir2.c b/repair/dir2.c
> > index eabdb4f2d497..b6a8a5c40ae4 100644
> > --- a/repair/dir2.c
> > +++ b/repair/dir2.c
> > @@ -20,40 +20,40 @@
> >   * Known bad inode list.  These are seen when the leaf and node
> >   * block linkages are incorrect.
> >   */
> > -typedef struct dir2_bad {
> > -	xfs_ino_t	ino;
> > -	struct dir2_bad	*next;
> > -} dir2_bad_t;
> > +struct dir2_bad {
> > +	unsigned int	nr;
> > +	xfs_ino_t	*itab;
> > +};
> >  
> > -static dir2_bad_t *dir2_bad_list;
> > +static struct dir2_bad	dir2_bad;
> >  
> >  static void
> >  dir2_add_badlist(
> >  	xfs_ino_t	ino)
> >  {
> > -	dir2_bad_t	*l;
> > +	xfs_ino_t	*itab;
> >  
> > -	if ((l = malloc(sizeof(dir2_bad_t))) == NULL) {
> > +	itab = realloc(dir2_bad.itab, (dir2_bad.nr + 1) * sizeof(xfs_ino_t));
> > +	if (!itab) {
> >  		do_error(
> >  _("malloc failed (%zu bytes) dir2_add_badlist:ino %" PRIu64 "\n"),
> > -			sizeof(dir2_bad_t), ino);
> > +			sizeof(xfs_ino_t), ino);
> >  		exit(1);
> >  	}
> > -	l->next = dir2_bad_list;
> > -	dir2_bad_list = l;
> > -	l->ino = ino;
> > +	itab[dir2_bad.nr++] = ino;
> > +	dir2_bad.itab = itab;
> >  }
> >  
> > -int
> > +bool
> >  dir2_is_badino(
> >  	xfs_ino_t	ino)
> >  {
> > -	dir2_bad_t	*l;
> > +	unsigned int i;
> >  
> > -	for (l = dir2_bad_list; l; l = l->next)
> > -		if (l->ino == ino)
> > -			return 1;
> > -	return 0;
> > +	for (i = 0; i < dir2_bad.nr; ++i)
> > +		if (dir2_bad.itab[i] == ino)
> > +			return true;
> > +	return false;
> 
> This ignores the problem with this code: it is a O(n * N) search
> that gets done under an exclusive lock. Changing this to an array
> doesn't improve the efficiency of the algorithm at all. It might
> slighty reduce the magnitude of N, but modern CPU prefetchers detect
> link list walks like this so are almost as fast sequentail array
> walks. Hence this change will gain us relatively little when we have
> millions of bad inodes to search.
> 
> IOWs, the scalability problem that needs to be solved here is not
> "replace a linked list", it is "replace an O(n * N) search
> algorithm". We should address the algorithmic problem, not the code
> implementation issue.
> 
> That means we need to replace the linear search with a different
> algorithm, not rework the data structure used to do a linear search.
> We want this search to be reduced to O(n * log N) or O(n). We really
> don't care about memory usage or even the overhead of per-object
> memory allocation - we already do that and it isn't a performance
> limitation, so optimising for memory allocation reductions is
> optimising the wrong thing.
> 
> Replacing the linked list with an AVL tree or radix tree will make
> the search O(log N), giving us the desired reduction in search
> overhead to O(n * log N) and, more importantly, a significant
> reduction in lock hold times.

Obviously, I agree with your idea. Radix tree indexed by inode number
may waste space. I didn't notice that repair code had some AVL
infrastructure, let me try to use AVL instead.

But anyway, I'm not sure if it's a critical problem and if users really
came into bad inode bomb. Use array here just because some previous
review suggestion.

Thanks,
Gao Xiang

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux