Re: [PATCH] unpack-trees: fix accidentally quadratic behavior

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



David Turner <dturner@xxxxxxxxxxxxxxxx> writes:

> On Wed, 2016-01-20 at 20:58 -0800, Junio C Hamano wrote:
>> David Turner <dturner@xxxxxxxxxxxxxxxx> writes:
>> 
>> > While unpacking trees (e.g. during git checkout), when we hit a
>> > cache
>> > entry that's past and outside our path, we cut off iteration.
>> > 
>> > This provides about a 45% speedup on git checkout between master
>> > and
>> > master^20000 on Twitter's monorepo.  Speedup in general will depend
>> > on
>> > repostitory structure, number of changes, and packfile packing
>> > decisions.
>> > 
>> > Signed-off-by: David Turner <dturner@xxxxxxxxxxxxxxxx>
>> > ---
>> 
>> I haven't thought things through, but does this get fooled by the
>> somewhat strange ordering rules of tree entries (i.e. a subtree
>> sorts as if its name is suffixed with a '/' in a tree object)?
>> 
>> Other than that, I like this.  "We know the list is sorted, and
>> after seeing this entry we know there is nothing that will match" is
>> an obvious optimization that we already use elsewhere.
>> 
>> Thanks.
>
> I think this is correct, because we first do the more complicated check
> (ce_in_traverse_path), and only check the ordering once that has
> failed.  

But the patch does this:

> +			if (info->prev && info->traverse_path) {
> +				int prefix_cmp = strncmp(ce->name, info->traverse_path, info->pathlen);
> +				if (prefix_cmp > 0)
> +					break;
> +				else if (prefix_cmp == 0 &&
> +					 ce_namelen(ce) >= info->pathlen &&
> +					 strcmp(ce->name + info->pathlen,
> +						 info->name.path) > 0) {
> +					break;
> +				}
> +			}
>  			continue;

The first break is correct, but I am not sure about the "else if"
part.  Shouldn't it be doing something similar to the logic to "keep
looking" that talks about "t-i", "t" and "t/a" at the end of the
loop?

> The tests all pass, so this should be good.

Please don't ever say that again.  The existing tests do not cover
all corner cases, and they fundamentally cannot cover the corner
cases future changes may introduce.

Saying "Even test X breaks, so it is clearly broken.  Why did you
send it out without testing?" is OK, though.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]