On Thu, Apr 21, 2016 at 10:23 AM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Apr 21, 2016 at 10:08 AM, Jeff King <peff@xxxxxxxx> wrote:
>>
>> Right, because it makes the names longer. We give the second-parent
>> traversal a heuristic cost. If we drop that cost to "1", like:
>
> So I dropped it to 500 (removed the two last digits), and it gave a
> reasonable answer. With 1000, it gave the same "based on 4.6" answer
> as the current 65536 value does.
>
>> which is technically true, but kind of painful to read. It may be that a
>> reasonable weight is somewhere between "1" and "65535", though.
>
> Based on my tests, the "right" number is somewhere in the 500-1000
> range for this particular case. But it's still a completely made up
> number.
>
>> However, I think the more fundamental confusion with git-describe is
>> that people expect the shortest distance to be the "first" tag that
>> contained the commit, and that is clearly not true in a branchy history.
>
> Yeah.
>
> And I don't think people care *too* much, because I'm sure this has
> happened before, it's just that before when it happened it wasn't
> quite _so_ far off the expected path..
>
>> I actually think most people would be happy with an algorithm more like:
>>
>>   1. Find the "oldest" tag (either by timestamp, or by version-sorting
>>      the tags) that contains the commit in question.
>
> Yes, we might want to base the "distance" at least partly on the age
> of the base commits.
>
>>   2. Find the "simplest" path from that tag to the commit, where we
>>      are striving mostly for shortness of explanation, not of path (so
>>      "~500" is way better than "~20^2~30^2~14", even though the latter
>>      is technically a shorter path).
>
> Well, so the three different paths I've seen are:
>
>  - standard git (65536), and 1000:
>      aed06b9 tags/v4.6-rc1~9^2~792
>
>  - non-first-parent cost: 500:
>      aed06b9 tags/v3.13-rc7~9^2~14^2~42
>
>  - non-first parent cost: 1:
>      aed06b9 tags/v3.13~5^2~4^2~2^2~1^2~42
>
> so there clearly are multiple valid answers.
>
> I would actually claim that the middle one is the best one - but I
> claim that based on your algorithm case #1. The last one may be the
> shortest actual path, but it's a shorter path to a newer tag that is a
> superset of the older tag, so the middle one is actually not just
> better based on age, but is a better choice based on "minimal actual
> history".
>
>              Linus

Combining Junio's and Linus' ideas:

* We want to have the minimal history, i.e. the tag with the fewest
  cumulative parent commits. For example, v3.13-rc7 is better than v3.13
  because `git log --oneline v3.13-rc7 | wc -l` (414317) is smaller than
  `git log --oneline v3.13 | wc -l` (414530). The difference is 213.

    tags/v3.13-rc7~9^2~14^2~42 has 9 + 14 + 42 additional steps (65)
    tags/v3.13~5^2~4^2~2^2~1^2~42 has 5 + 4 + 2 + 1 + 42 steps (54)

  tags/v3.13~5^2~4^2~2^2~1^2~42 has 11 fewer steps, but its base tag has
  a higher weight by 213. v4.6-rc1 has even more weight (588477).

So I guess what I propose is to take the weight of a tag into account
via `git log --oneline <tag> | wc -l`, as that gives the tag which
encloses the least history?

We also do not want to have "a lot of side traversals", so we could
penalize each additional addendum with a heuristic cost.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
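
To make the weighting proposed in the reply above concrete, here is a rough Python sketch that scores the three candidate names from the thread. The tag weights are the `git log --oneline <tag> | wc -l` counts quoted in the message. Everything else is an assumption made for illustration: the helper names `parse_name` and `score`, the `side_penalty` parameter, and above all the choice to simply add the tag weight, the step count, and the penalty. The thread does not say how the terms should be combined, and this is not how git-describe works today.

```python
import re

# Tag weights quoted in the message: `git log --oneline <tag> | wc -l`.
TAG_WEIGHT = {
    "v3.13-rc7": 414317,
    "v3.13": 414530,
    "v4.6-rc1": 588477,
}


def parse_name(name):
    """Split a name such as 'tags/v3.13-rc7~9^2~14^2~42' into
    (tag, first-parent steps, number of non-first-parent hops)."""
    if name.startswith("tags/"):
        name = name[len("tags/"):]
    match = re.match(r"([^~^]+)(.*)", name)
    tag, suffix = match.group(1), match.group(2)
    # '~N' walks N first-parent steps; a bare '~' counts as one step.
    steps = sum(int(n or 1) for n in re.findall(r"~(\d*)", suffix))
    # '^N' with N >= 2 leaves the first-parent line (a "side traversal").
    side = sum(1 for n in re.findall(r"\^(\d+)", suffix) if int(n) >= 2)
    return tag, steps, side


def score(name, side_penalty=0):
    """Hypothetical cost: tag weight + steps + a penalty per side traversal.
    Lower is better. The plain sum is an assumption, not part of the proposal."""
    tag, steps, side = parse_name(name)
    return TAG_WEIGHT[tag] + steps + side_penalty * side


candidates = [
    "tags/v4.6-rc1~9^2~792",
    "tags/v3.13-rc7~9^2~14^2~42",
    "tags/v3.13~5^2~4^2~2^2~1^2~42",
]

best = min(candidates, key=score)
print(best)
```

With these numbers the tag weight dominates, so the v3.13-rc7 name wins, both with no penalty and with a moderate `side_penalty` such as 500. Whether that balance between the terms is the right one is exactly the open question in the thread.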