[cc'ing Shawn, who probably doesn't remember at all how git-describe works these days, but it's worth a shot] On Wed, Nov 10, 2010 at 02:00:16AM +0100, Miklos Vajna wrote: > The frugalware project (git repo: > http://frugalware.org/git/pub/frugalware/frugalware-current ) has the > two latest tags 1.3 and 1.4pre1, somehow git describe HEAD now mentions > 1.3 and not 1.4pre1 in the output. > > To be more exact: > > $ git rev-parse --short HEAD > f25435f > > $ git rev-list 1.4pre1..|wc -l > 871 > > So I would exepct 1.4pre1-871-gf25435f. In fact, the output is: > > $ git describe > 1.3-3028-gf25435f > > Or in case I force the usage of the latest tag: > > $ git describe --candidate 1 > 1.4pre1-64725-gf25435f > > Does anyone have an idea what's going wrong here? The describe code is pretty inscrutable. But what it should be doing is roughly: 1. Describe all interesting refs (annotated tags by default) by marking the commit object's util field. 2. Walk backwards in the commit tree. As we walk backwards, keep track of how far (in commits) we have gone. When we see a tag, mark it with how far we've gone. The trick is in keeping track of how far we've gone. It looks like we keep the number of seen_commits, increment each time we traverse a commit, and then assign that to the "depth" field. But I don't see how that can be right. We are traversing in a breadth-first manner, so we may look at 1000 commits down one ancestry chain of a merge before following the first parent on another. But when we finally follow the second parent, our seen_commits is much higher than the actual distance we had to travel to get here. Looking at the frugalware history in gitk, you might be triggering this; you have a history with just a few merges, but extremely long chains of commits on each branch. Repos like linux or git.git are a bit bushier in the shape of the graph. But there's more going on there that I don't quite understand. There is some kind of magic "within" flag being kept for each tag we find. It's a little too late at night for me to wrap my head around it tonight. But I don't think it changes the fundamental issue that our depth value is just an estimate; its accuracy works under the assumption that all sides of a branch are about equal. It seems to me the "right" way to do it would be to track the depth to get to each commit. We can do this efficiently by just keeping a depth marker for each commit we push onto the commit_list queue. When we visit each commit, we know its depth, and when we push on its parents, they get its depth+1. The patch below implements that in a very rough-and-dirty way. It does find the 1.4 tag in your repository that you expect. However: 1. This still isn't quite accurate. We might visit a commit by two different paths. When we reach it by a shorter path, we need some way of saying "oops, the depth on any tags we found following this path are going to be too long. We need to go back and shorten them". 2. I am getting nonsensical results when trying it in git.git. It really wants to point me to gitgui tags, which makes no sense. So clearly there is a bug, or my idea is flawed somehow. But it's too late to think about anymore tonight. :) -Peff PS This would be a much simpler algorithm to write in a depth-first way. But that would also involve traversing the entire graph down to the roots, which we try to avoid. Which reminds me of my "tag --contains" depth first algorithm, and gives me some ideas on how to make it work in a breadth-first way. So even if my idea here is flawed, this thinking hasn't been completely fruitless. :) -- >8 -- diff --git a/builtin/describe.c b/builtin/describe.c index 43caff2..31cf855 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -164,39 +164,6 @@ static int compare_pt(const void *a_, const void *b_) return 0; } -static unsigned long finish_depth_computation( - struct commit_list **list, - struct possible_tag *best) -{ - unsigned long seen_commits = 0; - while (*list) { - struct commit *c = pop_commit(list); - struct commit_list *parents = c->parents; - seen_commits++; - if (c->object.flags & best->flag_within) { - struct commit_list *a = *list; - while (a) { - struct commit *i = a->item; - if (!(i->object.flags & best->flag_within)) - break; - a = a->next; - } - if (!a) - break; - } else - best->depth++; - while (parents) { - struct commit *p = parents->item; - parse_commit(p); - if (!(p->object.flags & SEEN)) - insert_by_date(p, list); - p->object.flags |= c->object.flags; - parents = parents->next; - } - } - return seen_commits; -} - static void display_name(struct commit_name *n) { if (n->prio == 2 && !n->tag) { @@ -231,7 +198,6 @@ static void describe(const char *arg, int last_one) struct commit_name *n; struct possible_tag all_matches[MAX_TAGS]; unsigned int match_cnt = 0, annotated_cnt = 0, cur_match; - unsigned long seen_commits = 0; unsigned int unannotated_cnt = 0; if (get_sha1(arg, sha1)) @@ -262,10 +228,11 @@ static void describe(const char *arg, int last_one) list = NULL; cmit->object.flags = SEEN; commit_list_insert(cmit, &list); + list->depth = 0; while (list) { + unsigned long depth = list->depth; struct commit *c = pop_commit(&list); struct commit_list *parents = c->parents; - seen_commits++; n = c->util; if (n) { if (!tags && !all && n->prio < 2) { @@ -273,7 +240,7 @@ static void describe(const char *arg, int last_one) } else if (match_cnt < max_candidates) { struct possible_tag *t = &all_matches[match_cnt++]; t->name = n; - t->depth = seen_commits - 1; + t->depth = depth; t->flag_within = 1u << match_cnt; t->found_order = match_cnt; c->object.flags |= t->flag_within; @@ -285,11 +252,6 @@ static void describe(const char *arg, int last_one) break; } } - for (cur_match = 0; cur_match < match_cnt; cur_match++) { - struct possible_tag *t = &all_matches[cur_match]; - if (!(c->object.flags & t->flag_within)) - t->depth++; - } if (annotated_cnt && !list) { if (debug) fprintf(stderr, "finished search at %s\n", @@ -299,8 +261,10 @@ static void describe(const char *arg, int last_one) while (parents) { struct commit *p = parents->item; parse_commit(p); - if (!(p->object.flags & SEEN)) - insert_by_date(p, &list); + if (!(p->object.flags & SEEN)) { + struct commit_list *cl = insert_by_date(p, &list); + cl->depth = depth + 1; + } p->object.flags |= c->object.flags; parents = parents->next; } @@ -329,9 +293,7 @@ static void describe(const char *arg, int last_one) if (gave_up_on) { insert_by_date(gave_up_on, &list); - seen_commits--; } - seen_commits += finish_depth_computation(&list, &all_matches[0]); free_commit_list(list); if (debug) { @@ -341,7 +303,6 @@ static void describe(const char *arg, int last_one) prio_names[t->name->prio], t->depth, t->name->path); } - fprintf(stderr, "traversed %lu commits\n", seen_commits); if (gave_up_on) { fprintf(stderr, "more than %i tags found; listed %i most recent\n" diff --git a/commit.h b/commit.h index a3618f8..6493606 100644 --- a/commit.h +++ b/commit.h @@ -8,6 +8,7 @@ struct commit_list { struct commit *item; + unsigned long depth; struct commit_list *next; }; -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html