Re: [PATCH] unpack-trees: fix accidentally quadratic behavior

David Turner <dturner@xxxxxxxxxxxxxxxx> writes:

> While unpacking trees (e.g. during git checkout), when we hit a cache
> entry that's past and outside our path, we cut off iteration.
>
> This provides about a 45% speedup on git checkout between master and
> master^20000 on Twitter's monorepo.  Speedup in general will depend on
> repository structure, number of changes, and packfile packing
> decisions.
>
> Signed-off-by: David Turner <dturner@xxxxxxxxxxxxxxxx>
> ---

I haven't thought things through, but does this get fooled by the
somewhat strange ordering rules of tree entries (i.e. a subtree
sorts as if its name is suffixed with a '/' in a tree object)?
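
To make the ordering concern concrete, here is a minimal sketch of a
tree-style comparison.  The helper name tree_entry_cmp() and its
signature are made up for illustration; this is not the code in
git.git, only the rule described above: when one name is a prefix of
the other, a directory compares as if it ended in '/'.

```c
#include <string.h>

/*
 * Hypothetical helper (illustration only): compare two entry names
 * the way a tree object orders them.  A directory name is treated
 * as if it had a trailing '/'.
 */
static int tree_entry_cmp(const char *a, int a_is_dir,
			  const char *b, int b_is_dir)
{
	size_t alen = strlen(a), blen = strlen(b);
	size_t len = alen < blen ? alen : blen;
	int cmp = memcmp(a, b, len);
	unsigned char ca, cb;

	if (cmp)
		return cmp;
	/* Common prefix is equal; a[len] and b[len] are at worst NUL. */
	ca = (unsigned char)a[len];
	cb = (unsigned char)b[len];
	if (!ca && a_is_dir)
		ca = '/';
	if (!cb && b_is_dir)
		cb = '/';
	return (ca > cb) - (ca < cb);
}
```

With this rule a file "foo" sorts before "foo.c", but a directory
"foo" sorts after it, because '/' (0x2f) is greater than '.' (0x2e).
A plain strcmp() on the bare names would not see that difference.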

Other than that, I like this.  "We know the list is sorted, and
after seeing this entry we know there is nothing that will match" is
an obvious optimization that we already use elsewhere.
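
A minimal sketch of that general idea, assuming an array of paths
sorted in plain byte order; count_under_prefix() and its parameters
are hypothetical names used only for this illustration, not anything
in unpack-trees.c:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): count the entries of a
 * byte-order-sorted path list that start with "prefix".  Once an
 * entry compares greater than the prefix, no later entry can match,
 * so the scan stops.  "visited" (optional) receives the number of
 * entries that were examined before the loop ended.
 */
static size_t count_under_prefix(const char **sorted, size_t n,
				 const char *prefix, size_t *visited)
{
	size_t plen = strlen(prefix), count = 0, i;

	for (i = 0; i < n; i++) {
		int cmp = strncmp(sorted[i], prefix, plen);

		if (cmp > 0)
			break;	/* past the prefix; nothing later matches */
		if (!cmp)
			count++;
	}
	if (visited)
		*visited = i;
	return count;
}
```

For the list "a/x", "b/1", "b/2", "c/y", "d/z" and the prefix "b/",
the loop stops at "c/y" and never looks at "d/z".  The break is only
safe when the list order and the comparison used for the cut-off
agree, which is why the ordering question matters.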

Thanks.

>  unpack-trees.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 5f541c2..b18a611 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -695,8 +695,25 @@ static int find_cache_pos(struct traverse_info *info,
>  				++o->cache_bottom;
>  			continue;
>  		}
> -		if (!ce_in_traverse_path(ce, info))
> +		if (!ce_in_traverse_path(ce, info)) {
> +			/*
> +			 * Check if we can skip future cache checks
> +			 * (because we're already past all possible
> +			 * entries in the traverse path).
> +			 */
> +			if (info->prev && info->traverse_path) {
> +				int prefix_cmp = strncmp(ce->name, info->traverse_path, info->pathlen);
> +				if (prefix_cmp > 0)
> +					break;
> +				else if (prefix_cmp == 0 &&
> +					 ce_namelen(ce) >= info->pathlen &&
> +					 strcmp(ce->name + info->pathlen,
> +						 info->name.path) > 0) {
> +					break;
> +				}
> +			}
>  			continue;
> +		}
>  		ce_name = ce->name + pfxlen;
>  		ce_slash = strchr(ce_name, '/');
>  		if (ce_slash)