Re: [PATCH] name-rev: prefer shorter names over following merges

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Elijah,

On Sat, 13 Nov 2021, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@xxxxxxxxx>
>
> name-rev has a MERGE_TRAVERSAL_WEIGHT to say that traversing a second or
> later parent of a merge should be 65535 times more expensive than a
> first-parent traversal, as per ac076c29ae8d (name-rev: Fix non-shortest
> description, 2007-08-27).  The point of this weight is to prefer names
> like
>
>     v2.32.0~1471^2
>
> over names like
>
>     v2.32.0~43^2~15^2~11^2~20^2~31^2
>
> which are two equally valid names in git.git for the same commit.  Note
> that the first follows 1472 parent traversals compared to a mere 125 for
> the second.  Weighting all traversals equally would clearly prefer the
> second name since it has fewer parent traversals, but humans aren't
> going to be traversing commits and they tend to have an easier time
> digesting names with fewer segments.  The fact that the former only has
> two segments (~1471, ^2) makes it much simpler than the latter which has
> six segments (~43, ^2, ~15, etc.).  Since name-rev is meant to "find
> symbolic names suitable for human digestion", we prefer fewer segments.
>
> However, the particular rule implemented in name-rev would actually
> prefer
>
>     v2.33.0-rc0~11^2~1
>
> over
>
>     v2.33.0-rc0~20^2
>
> because both have precisely one second parent traversal, and it gives
> the tie breaker to shortest number of total parent traversals.  Fewer
> segments is more important for human consumption than number of hops, so
> we'd rather see the latter which has one fewer segment.
>
> Include the generation in is_better_name() and use a new
> effective_distance() calculation so that we prefer fewer segments in
> the printed name over fewer total parent traversals performed to get the
> answer.

Thank you. As you most likely figured out, that magic weight was
introduced by me, in ac076c29ae8 (name-rev: Fix non-shortest description,
2007-08-27). And indeed the motivation was to keep the name as short as
possible.

Technically, your solution does not fix the problem fully, as we still do
not determine the _shortest possible_ name. Having said that, I think your
patch improves the situation dramatically, so: ACK!

Thanks,
Dscho

>
> Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> ---
>     name-rev: prefer shorter names over following merges
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1119%2Fnewren%2Fprefer-shorter-names-in-name-rev-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1119/newren/prefer-shorter-names-in-name-rev-v1
> Pull-Request: https://github.com/git/git/pull/1119
>
>  builtin/name-rev.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index b221d300147..27f60153a6c 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -44,11 +44,20 @@ static struct rev_name *get_commit_rev_name(const struct commit *commit)
>  	return is_valid_rev_name(name) ? name : NULL;
>  }
>
> +static int effective_distance(int distance, int generation)
> +{
> +	return distance + (generation > 0 ? MERGE_TRAVERSAL_WEIGHT : 0);
> +}
> +
>  static int is_better_name(struct rev_name *name,
>  			  timestamp_t taggerdate,
> +			  int generation,
>  			  int distance,
>  			  int from_tag)
>  {
> +	int name_distance = effective_distance(name->distance, name->generation);
> +	int new_distance = effective_distance(distance, generation);
> +
>  	/*
>  	 * When comparing names based on tags, prefer names
>  	 * based on the older tag, even if it is farther away.
> @@ -56,7 +65,7 @@ static int is_better_name(struct rev_name *name,
>  	if (from_tag && name->from_tag)
>  		return (name->taggerdate > taggerdate ||
>  			(name->taggerdate == taggerdate &&
> -			 name->distance > distance));
> +			 name_distance > new_distance));
>
>  	/*
>  	 * We know that at least one of them is a non-tag at this point.
> @@ -69,8 +78,8 @@ static int is_better_name(struct rev_name *name,
>  	 * We are now looking at two non-tags.  Tiebreak to favor
>  	 * shorter hops.
>  	 */
> -	if (name->distance != distance)
> -		return name->distance > distance;
> +	if (name_distance != new_distance)
> +		return name_distance > new_distance;
>
>  	/* ... or tiebreak to favor older date */
>  	if (name->taggerdate != taggerdate)
> @@ -88,7 +97,7 @@ static struct rev_name *create_or_update_name(struct commit *commit,
>  	struct rev_name *name = commit_rev_name_at(&rev_names, commit);
>
>  	if (is_valid_rev_name(name)) {
> -		if (!is_better_name(name, taggerdate, distance, from_tag))
> +		if (!is_better_name(name, taggerdate, generation, distance, from_tag))
>  			return NULL;
>
>  		/*
>
> base-commit: 9d530dc0024503ab4218fe6c4395b8a0aa245478
> --
> gitgitgadget
>




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux