Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r

On Wed, Oct 04, 2017 at 03:07:25PM +0900, Junio C Hamano wrote:

> > -	exists = has_sha1_file(sha1);
> > -	while (len < GIT_SHA1_HEXSZ) {
> > -		struct object_id oid_ret;
> > -		status = get_short_oid(hex, len, &oid_ret, GET_OID_QUIETLY);
> > -		if (exists
> > -		    ? !status
> > -		    : status == SHORT_NAME_NOT_FOUND) {
> > -			hex[len] = 0;
> > -			return len;
> > -		}
> > -		len++;
> > -	}
> > -	return len;
> 
> The "always_call_fn" thing is a big sledgehammer that overrides
> everything else in update_candidates().  It bypasses the careful
> machinery set up to avoid having to open ambiguous object to learn
> their types as much as possible.  One narrow exception when it is OK
> to use is if we never limit our candidates with type.
> 
> And it might appear that the conversion is safe (if only because we
> do not see any type limitation in the get_short_oid() call above),
> but I think there is one case where this patch changes the
> behaviour: what happens if core.disambiguate was set to anything
> other than "none"?  The new code does not know anything about type
> based filtering, so it can end up reporting longer abbreviation than
> it was asked to produce.  It may not be a problem in practice, though.
> 
> I am not sure if setting core.disambiguate is generally a good idea
> in the first place, and if it is OK to break find_unique_abbrev()
> with respect to the configuration variable like this patch does.
> 
> I'd feel safe if we get extra input from Peff, who introduced the
> feature in 5b33cb1f ("get_short_sha1: make default disambiguation
> configurable", 2016-09-27).

Regarding core.disambiguate, I _do_ think it's reasonable to set it to
"commit" or "committish". And in fact I have meant to revisit the idea
of doing so by default (the reason it was made into config at all was to
let people play around with it and gain experience).

That said, I think it's entirely reasonable for find_unique_abbrev() to
ignore type-based disambiguation entirely.

The type disambiguation is really a property of the context in which we
do a lookup. And that context is not necessarily known to the generating
side. Even core.disambiguate is not universal, as command-specific
context overrides it.

So I think on the generating side we are better off creating a slightly
longer abbreviation that is unambiguous no matter what context it is
used in. I.e., I'd argue that it's actually more _correct_ to ignore
the disambiguation code entirely on the generating side.

And it should also be faster, because it turns the abbreviation search
into a purely textual one that never has to look at extra objects. And
that speed matters a lot more on the generating side, where we tend to
output long lists of abbreviated sha1s in commands like "git log" (as
opposed to the lookup side, where we're asked to find some particular
item of interest).
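
To illustrate the "purely textual" point, here is a minimal sketch in C.
It is not the code from the patch: unique_abbrev_len() and
common_prefix_len() are hypothetical helpers, and the sketch assumes the
object names are available as a sorted array of NUL-terminated hex
strings. In a sorted list, the entries sharing the longest prefix with a
given name are its immediate neighbors, so comparing against those two
is enough and no object ever has to be opened to learn its type.

```c
#include <stddef.h>
#include <string.h>

/* Number of leading characters that a and b have in common. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t n = 0;

	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Hypothetical helper: return the shortest prefix length (at least
 * min_len) of sorted[pos] that no other entry shares.  The array must
 * be sorted and free of duplicates.  Only the two neighbors are
 * inspected, and only as text; object types are never consulted.
 */
static size_t unique_abbrev_len(const char **sorted, size_t nr,
				size_t pos, size_t min_len)
{
	size_t full = strlen(sorted[pos]);
	size_t len = min_len;

	if (pos > 0) {
		/* One more character than we share with the previous name. */
		size_t need = common_prefix_len(sorted[pos], sorted[pos - 1]) + 1;
		if (need > len)
			len = need;
	}
	if (pos + 1 < nr) {
		/* Likewise for the next name. */
		size_t need = common_prefix_len(sorted[pos], sorted[pos + 1]) + 1;
		if (need > len)
			len = need;
	}
	/* Never report more than the full name. */
	return len < full ? len : full;
}
```

With toy six-character names { "1f3a9c", "1f3b00", "1f3b07", "a94e21" },
the first entry needs four characters ("1f3a"), while the second needs
all six because it shares "1f3b0" with its successor. Real object names
are GIT_SHA1_HEXSZ characters long, but the comparison is the same.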

-Peff


