Re: [PATCH 1/2] sha1_name: try to use same abbrev length when core.abbrevguard is specified

Junio C Hamano <gitster@xxxxxxxxx> · Wed, 09 Mar 2011 11:12:00 -0800

Namhyung Kim <namhyung@xxxxxxxxx> writes:

> If find_unique_abbrev() finds an ambiguous SHA1 name, it tries
> to find again with increased length. In this case, result hex
> strings could have different lengths even though the
> core.abbrevguard config option is specified. But if the option
> is specified and increased length (delta) is less than its
> value, the result could be adjusted to the same length.

I am not sure I understand what problem you are trying to solve from the
above description.

The function is given "len" from the caller to specify the minimum length
of the output the caller expects (i.e. even if 4 hexdigits is enough to
identify the given commit in a small project, the caller can say it wants
to see at least 7 hexdigits).  The loop without your patch finds the
shortest prefix whose length is at least that given length that uniquely
identifies the given object (or the shortest prefix that doesn't identify
any existing object if the given sha1 does not exist in the repository).
And then ensures the returned value is longer by the guard as an extra
safety measure, so that later when the project grows, the disambiguation
we find today has a better chance to survive.

With this patch, the loop decreases the length of the guard when "len"
given by the caller is insufficient to ensure uniqueness, which does not
sound right.

Suppose other objects share a prefix with the given object, so that you
need at least 8 hexdigits to make it unique in today's history.  The caller
gives you len of 7, and the guard is set to 3.

With the original code, the loop starts with 7, finds that it is not
long enough to disambiguate, increments and retries, finds that 8 is the
shortest prefix, and then adds the guard and returns 11 hexdigits.

With your patch, the loop starts with 7 with extra set to 3, finds that 7
is not long enough and decrements extra to 2, finds that 8 is the shortest
prefix, and then returns only 10 hexdigits.

That feels like it goes totally against the reason why we added the guard.
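
To make the arithmetic concrete, here is a minimal standalone sketch of the
two loops.  It is not the code in sha1_name.c: the object lookup is replaced
by a hypothetical is_unique() helper that treats a prefix as unique once it
reaches min_unique hexdigits, and the function and parameter names are made
up for illustration.

```c
/*
 * Hypothetical stand-in for the real lookup: pretend a prefix of
 * "len" hexdigits is unique once len reaches "min_unique".
 */
static int is_unique(int len, int min_unique)
{
	return len >= min_unique;
}

/* Original behaviour: find the shortest unique length, add the full guard. */
static int abbrev_len_original(int len, int guard, int min_unique)
{
	while (len < 40) {
		if (is_unique(len, min_unique)) {
			int cut_at = len + guard;
			return (cut_at < 40) ? cut_at : 40;
		}
		len++;
	}
	return 40;
}

/* Patched behaviour: the guard shrinks by one for every retry. */
static int abbrev_len_patched(int len, int guard, int min_unique)
{
	int extra_len = guard;

	while (len < 40) {
		if (is_unique(len, min_unique)) {
			int cut_at = len + extra_len;
			return (cut_at < 40) ? cut_at : 40;
		}
		len++;
		if (extra_len > 0)
			extra_len--;
	}
	return 40;
}
```

Under these assumptions, abbrev_len_original(7, 3, 8) gives 11 and
abbrev_len_patched(7, 3, 8) gives 10, matching the two cases described
above.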

What am I missing?

>  sha1_name.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/sha1_name.c b/sha1_name.c
> index 709ff2e..6bb8942 100644
> --- a/sha1_name.c
> +++ b/sha1_name.c
> @@ -197,6 +197,7 @@ const char *find_unique_abbrev(const unsigned char *sha1, int len)
>  {
>  	int status, exists;
>  	static char hex[41];
> +	int extra_len = unique_abbrev_extra_length;
>  
>  	exists = has_sha1_file(sha1);
>  	memcpy(hex, sha1_to_hex(sha1), 40);
> @@ -208,12 +209,14 @@ const char *find_unique_abbrev(const unsigned char *sha1, int len)
>  		if (exists
>  		    ? !status
>  		    : status == SHORT_NAME_NOT_FOUND) {
> -			int cut_at = len + unique_abbrev_extra_length;
> +			int cut_at = len + extra_len;
>  			cut_at = (cut_at < 40) ? cut_at : 40;
>  			hex[cut_at] = 0;
>  			return hex;
>  		}
>  		len++;
> +		if (extra_len > 0)
> +			extra_len--;
>  	}
>  	return hex;
>  }