Re: [PATCH v3 1/8] repair: turn bad inode list into array

On Tue, Mar 30, 2021 at 10:25:24PM +0800, Gao Xiang wrote:
> From: Gao Xiang <hsiangkao@xxxxxxxxxx>
> 
> Just use array and reallocate one-by-one here (not sure if bulk
> allocation is more effective or not.)

Did you profile repairing a filesystem with lots of broken
directories? Optimisations like this really need to be profile
guided and the impact documented. That way reviewers can actually
see the benefit the change brings to the table....

> Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxx>
> ---
>  repair/dir2.c | 34 +++++++++++++++++-----------------
>  repair/dir2.h |  2 +-
>  2 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/repair/dir2.c b/repair/dir2.c
> index eabdb4f2d497..b6a8a5c40ae4 100644
> --- a/repair/dir2.c
> +++ b/repair/dir2.c
> @@ -20,40 +20,40 @@
>   * Known bad inode list.  These are seen when the leaf and node
>   * block linkages are incorrect.
>   */
> -typedef struct dir2_bad {
> -	xfs_ino_t	ino;
> -	struct dir2_bad	*next;
> -} dir2_bad_t;
> +struct dir2_bad {
> +	unsigned int	nr;
> +	xfs_ino_t	*itab;
> +};
>  
> -static dir2_bad_t *dir2_bad_list;
> +static struct dir2_bad	dir2_bad;
>  
>  static void
>  dir2_add_badlist(
>  	xfs_ino_t	ino)
>  {
> -	dir2_bad_t	*l;
> +	xfs_ino_t	*itab;
>  
> -	if ((l = malloc(sizeof(dir2_bad_t))) == NULL) {
> +	itab = realloc(dir2_bad.itab, (dir2_bad.nr + 1) * sizeof(xfs_ino_t));
> +	if (!itab) {
>  		do_error(
>  _("malloc failed (%zu bytes) dir2_add_badlist:ino %" PRIu64 "\n"),
> -			sizeof(dir2_bad_t), ino);
> +			sizeof(xfs_ino_t), ino);
>  		exit(1);
>  	}
> -	l->next = dir2_bad_list;
> -	dir2_bad_list = l;
> -	l->ino = ino;
> +	itab[dir2_bad.nr++] = ino;
> +	dir2_bad.itab = itab;
>  }
>  
> -int
> +bool
>  dir2_is_badino(
>  	xfs_ino_t	ino)
>  {
> -	dir2_bad_t	*l;
> +	unsigned int i;
>  
> -	for (l = dir2_bad_list; l; l = l->next)
> -		if (l->ino == ino)
> -			return 1;
> -	return 0;
> +	for (i = 0; i < dir2_bad.nr; ++i)
> +		if (dir2_bad.itab[i] == ino)
> +			return true;
> +	return false;

This ignores the problem with this code: it is an O(n * N) search
that gets done under an exclusive lock. Changing this to an array
doesn't improve the efficiency of the algorithm at all. It might
slightly reduce the magnitude of N, but modern CPU prefetchers
detect linked list walks like this, so they are almost as fast as
sequential array walks. Hence this change will gain us relatively
little when we have millions of bad inodes to search.

IOWs, the scalability problem that needs to be solved here is not
"replace a linked list", it is "replace an O(n * N) search
algorithm". We should address the algorithmic problem, not the code
implementation issue.

That means we need to replace the linear search with a different
algorithm, not rework the data structure used to do a linear search.
We want this search to be reduced to O(n * log N) or O(n). We really
don't care about memory usage or even the overhead of per-object
memory allocation - we already do that and it isn't a performance
limitation, so optimising for memory allocation reductions is
optimising the wrong thing.

Replacing the linked list with an AVL tree or radix tree will make
the search O(log N), giving us the desired reduction in search
overhead to O(n * log N) and, more importantly, a significant
reduction in lock hold times.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


