Re: history damage in linux.git

On Thu, Apr 21, 2016 at 10:08 AM, Jeff King <peff@xxxxxxxx> wrote:
>
> Right, because it makes the names longer. We give the second-parent
> traversal a heuristic cost. If we drop that cost to "1", like:

So I dropped it to 500 (removed the two last digits), and it gave a
reasonable answer. With 1000, it gave the same "based on 4.6" answer
as the current 65536 value does.

> which is technically true, but kind of painful to read. It may be that a
> reasonable weight is somewhere between "1" and "65535", though.

Based on my tests, the "right" number is somewhere in the 500-1000
range for this particular case. But it's still a completely made up
number.

> However, I think the more fundamental confusion with git-describe is
> that people expect the shortest distance to be the "first" tag that
> contained the commit, and that is clearly not true in a branchy history.

Yeah.

And I don't think people care *too* much, because I'm sure this has
happened before; it's just that when it did, it wasn't quite _so_ far
off the expected path.

> I actually think most people would be happy with an algorithm more like:
>
>   1. Find the "oldest" tag (either by timestamp, or by version-sorting
>      the tags) that contains the commit in question.

Yes, we might want to base the "distance" at least partly on the age
of the base commits.

>   2. Find the "simplest" path from that tag to the commit, where we
>      are striving mostly for shortness of explanation, not of path (so
>      "~500" is way better than "~20^2~30^2~14", even though the latter
>      is technically a shorter path).

Well, so the three different paths I've seen are:

 - standard git (65536), and 1000:
   aed06b9 tags/v4.6-rc1~9^2~792

 - non-first-parent cost 500:
   aed06b9 tags/v3.13-rc7~9^2~14^2~42

 - non-first-parent cost 1:
   aed06b9 tags/v3.13~5^2~4^2~2^2~1^2~42

so there clearly are multiple valid answers.
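
To make the effect of the weight concrete, here is a rough sketch that
scores the three names above under a simplified cost model: each
first-parent hop costs 1, and each hop through a second (or later)
parent costs the merge weight. This model and the helper names
(split_name, path_cost, cheapest) are assumptions for illustration, not
git's actual code, and it only compares these three candidates rather
than walking real history.

```python
import re

# One step in a revision suffix: "~N" or "^N" (N defaults to 1).
STEP = re.compile(r"([~^])(\d*)")


def split_name(name):
    """Split 'tags/v4.6-rc1~9^2~792' into ('tags/v4.6-rc1', '~9^2~792').

    Ref names cannot contain '~' or '^', so the first such character
    starts the suffix.
    """
    m = re.search(r"[~^]", name)
    if m is None:
        return name, ""
    return name[:m.start()], name[m.start():]


def path_cost(name, merge_weight):
    """Cost of a name under the assumed model (not git's real code)."""
    _, suffix = split_name(name)
    cost = 0
    for op, digits in STEP.findall(suffix):
        n = int(digits) if digits else 1
        if op == "~":
            cost += n              # n first-parent hops
        elif n == 1:
            cost += 1              # "^" or "^1" is one first-parent hop
        elif n >= 2:
            cost += merge_weight   # hop through a non-first parent
    return cost


def cheapest(names, merge_weight):
    """Return the candidate name with the lowest assumed cost."""
    return min(names, key=lambda n: path_cost(n, merge_weight))


CANDIDATES = [
    "tags/v4.6-rc1~9^2~792",
    "tags/v3.13-rc7~9^2~14^2~42",
    "tags/v3.13~5^2~4^2~2^2~1^2~42",
]

if __name__ == "__main__":
    for weight in (65536, 1000, 500, 1):
        print(weight, cheapest(CANDIDATES, weight))
```

Under this model the three names have 801, 65 and 54 first-parent hops
and 1, 2 and 4 merge hops respectively, so the winner changes with the
weight in the same order as the results listed above.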

I would actually claim that the middle one is the best one - but I
claim that based on your algorithm case #1. The last one may be the
shortest actual path, but it's a shorter path to a newer tag that is a
superset of the older tag, so the middle one is actually not just
better based on age, but is a better choice based on "minimal actual
history".
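
As a sketch of the "oldest tag first, then simplest explanation" idea,
the following ranks candidate names by a version-sorted tag key and
breaks ties by the number of "~N"/"^N" steps in the suffix. The tag
pattern (vX.Y, optional .Z, optional -rcN), the rule that an -rc sorts
before its final release, and the function names are all assumptions
made for this example; a real implementation would have to start from
every tag that contains the commit, not from three precomputed names.

```python
import re

TAG = re.compile(r"(?:tags/)?v(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?")


def version_key(tag):
    """Sort key for tags such as 'tags/v3.13-rc7' (assumed naming scheme).

    Release candidates sort before the final release of the same version.
    """
    m = TAG.fullmatch(tag)
    if m is None:
        raise ValueError(f"unrecognised tag name: {tag!r}")
    major, minor, patch, rc = m.groups()
    is_final = 0 if rc else 1
    return (int(major), int(minor), int(patch or 0), is_final, int(rc or 0))


def explanation_length(suffix):
    """Number of '~N' / '^N' steps a reader has to follow."""
    return len(re.findall(r"[~^]\d*", suffix))


def pick(names):
    """Prefer the oldest tag, then the shortest explanation."""
    def key(name):
        m = re.search(r"[~^]", name)
        tag = name[:m.start()] if m else name
        suffix = name[m.start():] if m else ""
        return (version_key(tag), explanation_length(suffix))
    return min(names, key=key)


CANDIDATES = [
    "tags/v4.6-rc1~9^2~792",
    "tags/v3.13-rc7~9^2~14^2~42",
    "tags/v3.13~5^2~4^2~2^2~1^2~42",
]

if __name__ == "__main__":
    print(pick(CANDIDATES))
```

With these three candidates the v3.13-rc7 name is selected, since
v3.13-rc7 sorts before both v3.13 and v4.6-rc1.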

               Linus