On Thu, Apr 21, 2016 at 10:23 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > > I think avoiding side branches to describe with the weight is a > right thing to do, i.e. if you have this history: > > X---o---o---o---o---v4.6 > \ / > o-----------o > > you do not want to explain X as "v4.6~^2~2", and instead you want it > as "v4.6~5", even though the former is 4 hops while the latter is 5 > hops (which is longer). Yes. I have a new suggestion: make the "merge weight" depend on how complex the name ends up being. And that algorithm actually gives a completely new and improved path: aed06b9 tags/v3.13-rc2~32^2^2~47 which is still complex, but is actually the best one I've found so far. To compare against the previous ones I looked at: v4.6-rc1~9^2~792 <- current git code v3.13-rc2~32^2^2~47 <- new with attached patch v3.13-rc7~9^2~14^2~42 <- merge weight 500 v3.13~5^2~4^2~2^2~1^2~42 <- merge weight 1 that new one is actually the second-most dense, and uses the oldest tag. So it's a good combination of denseness, but still gets the best actual base tag. The logic of the patch is that there are different "complexities" in the naming, and it's not just whether you are following a second parent, it's also if you're doing generational hops. So when you do a first-parent change, the name stays simple (the last number changes), and that means that the "distance" weight is low (so that's the normal "+1" we do for first-parent. But if it's not a first parent, there are two different cases: - generation > 0: we add "~%d^%d", and we approximate that with "four characters", and use a cost that is four orders of magnitude higher (so 10000). - generation == 0: we add just "^%d", so generally just two characters. So use a cost that is just two orders of magnitude higher (so 100). In other words, I'm trying to convince people that my patch not only gives a good result, but that the "weight numbers" I use make some kind of conceptual sense from a complexity cost angle. With that, the patch itself is attached. I think it's better than what "git name-rev" does now. Is it optimal? No, I think the *optimal* would use some kind of "does one tag contain the other" logic, and discarding all base names that are just supersets of another base that still reaches the target. But this patch is small and simple, and has some excuses for its behavior. What do people think? Linus
builtin/name-rev.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/builtin/name-rev.c b/builtin/name-rev.c index 092e03c3cc9b..0354c8d222e1 100644 --- a/builtin/name-rev.c +++ b/builtin/name-rev.c @@ -16,9 +16,6 @@ typedef struct rev_name { static long cutoff = LONG_MAX; -/* How many generations are maximally preferred over _one_ merge traversal? */ -#define MERGE_TRAVERSAL_WEIGHT 65535 - static void name_rev(struct commit *commit, const char *tip_name, int generation, int distance, int deref) @@ -55,19 +52,26 @@ copy_data: parents; parents = parents->next, parent_number++) { if (parent_number > 1) { + int weight; size_t len; char *new_name; strip_suffix(tip_name, "^0", &len); - if (generation > 0) + + // The extra merge traversal "weight" depends + // on how complex the resulting name is. + if (generation > 0) { + weight = 10000; new_name = xstrfmt("%.*s~%d^%d", (int)len, tip_name, generation, parent_number); - else + } else { + weight = 100; new_name = xstrfmt("%.*s^%d", (int)len, tip_name, parent_number); + } name_rev(parents->item, new_name, 0, - distance + MERGE_TRAVERSAL_WEIGHT, 0); + distance + weight, 0); } else { name_rev(parents->item, tip_name, generation + 1, distance + 1, 0);