[BUG] `git describe` doesn't traverse the graph in topological order

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I found an issue where `git describe` doesn't find a "closer" tag than
another tag as the correct one to base the description off of. I have a
reproducer, but I'll first give details of the real world issue.

Repository: https://gitlab.kitware.com/vtk/vtk.git
`master` as of: dedf87b3a1b7e5be5d8cdb46b37ad3030590b8ac

$ git rev-parse HEAD
dedf87b3a1b7e5be5d8cdb46b37ad3030590b8ac
$ git rev-parse HEAD~2^2
da2482f716310fc59ac4be42ce977f6badc6af95
$ git rev-prase v9.3.0.rc1
f150d52568f4e00aa9c8b1568a521a08ded8d4cb
$ git rev-prase v9.3.0.rc1^{commit}
da2482f716310fc59ac4be42ce977f6badc6af95
$ git rev-parse HEAD~2^2~2
0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5
$ git rev-prase v9.3.0.rc0
e5e13b14629d445bf65d5f8a181920ed9b97d54c
$ git rev-prase v9.3.0.rc0^{commit}
0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5
$ git describe HEAD
v9.3.0.rc0-56-gdedf87b3a1
$ git describe --matches v9.3.0.rc1 HEAD
v9.3.0.rc1-86876-gdedf87b3a1

As you can see:

- v9.3.0.rc1 is "closer" to `HEAD` than v9.3.0.rc0 (created as a
  workaround for this bug; v9.2.6 is otherwise reported)
- v9.3.0.rc0 is an ancestor of v9.3.0.rc1
- Both v9.3.0.rc0 and v9.3.0.rc1 are ancestors of `HEAD`
- `git describe` reports that `HEAD` is "closest" to v9.3.0.rc0
- Forcing the issue and asking for v9.3.0.rc1 shows that it thinks there
  are almost 87000 commits somehow not on that commit.

I have a reproducer script attached. It reproduces back to 2.9.0 and
probably before. 2.8.0 didn't support the structure hiding that newer
OpenSSL 1.1 has done and given that it's at least that old, I don't
think it matters too much for backporting or anything like that.

I instrumented `git describe` with some `printf` debugging (diff
attached) and found out that the commit traversal is not happening in
topological order. I suspect that this is the root cause of the issue:

    looking at commit 0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5
    depth of 0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5: 16
    find order of 0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5: 2
    setting flag 4 for commit 0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5
    flag for 0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5: 5
    pushing depth of da2482f716310fc59ac4be42ce977f6badc6af95 because of 0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5 (flag_within): 16
    setting flag 5 for commit d478d2e22e81ee2602035fe1d731b402d9b4eda7 due to ancestry

    looking at commit 1c3d839dac92761ae0866e23d89bdc8ee690de08
    depth of 1c3d839dac92761ae0866e23d89bdc8ee690de08: 17
    find order of 1c3d839dac92761ae0866e23d89bdc8ee690de08: 3
    setting flag 8 for commit 1c3d839dac92761ae0866e23d89bdc8ee690de08
    flag for 1c3d839dac92761ae0866e23d89bdc8ee690de08: b
    pushing depth of 0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5 because of 1c3d839dac92761ae0866e23d89bdc8ee690de08 (flag_within): 17
    setting flag f for commit 0a77d7cf4fdbf489ee5d38c6fec6517574cdaeb5 due to ancestry

It looks at 0a77d7cf4 before it looks at 1c3d839da (which is
v9.3.0.rc1~), but 1c3d839da~ *is* 0a77d7cf4. Because 0a77d7cf4 has
already passed on its presence flags to its parent(s), the update
performed when processing 1c3d839da has no effect. Therefore the
"entire" history is not seen as being reachable from da2482f71 and it
ends up not being the best match.

I will note that the authorship date of 1c3d839da is before that of
either 0a77d7cf4 or da2482f71 (due to a rebase that reordered the
commits to keep 1c3d839da on the release-only part of the branch), but
the reproducer script doesn't seem to care that much.

I suspect that building of the `commit_list` is the problem, probably by
using `commit_list_insert_by_date` instead of by topological sorting.
The reproducer script doesn't do anything (AFAICT) sneaky with dates
(e.g., rebasing and such) though, so I'm nowhere near 100% confident
about that.

Perhaps commits should be re-scheduled if their `flags` get updated
based on a newly discovered ancestor while traversing? I suspect that
depth tracking becomes more complicated in that case though because the
second pass on 0a77d7cf4 needs to subtract a depth from the relevant
tags with the new flag value. But it'd find the right tag at least…

Thanks,

--Ben

Attachment: repro-git-describe.sh
Description: Bourne shell script

diff --git a/builtin/describe.c b/builtin/describe.c
index b28a4a1f82..5895d1af3a 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -264,8 +264,10 @@ static unsigned long finish_depth_computation(
 			}
 			if (!a)
 				break;
-		} else
+		} else {
 			best->depth++;
+			fprintf(stderr, "pushing depth of %s (finish_depth_computation): %d\n", oid_to_hex(&c->object.oid), best->depth);
+		}
 		while (parents) {
 			struct commit *p = parents->item;
 			repo_parse_commit(the_repository, p);
@@ -363,19 +365,24 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		struct commit_list *parents = c->parents;
 		struct commit_name **slot;
 
+		fprintf(stderr, "\n\nlooking at commit %s\n", oid_to_hex(&c->object.oid));
 		seen_commits++;
 		slot = commit_names_peek(&commit_names, c);
 		n = slot ? *slot : NULL;
 		if (n) {
 			if (!tags && !all && n->prio < 2) {
+				fprintf(stderr, "skipping unannotated tag %s\n", oid_to_hex(&c->object.oid));
 				unannotated_cnt++;
 			} else if (match_cnt < max_candidates) {
 				struct possible_tag *t = &all_matches[match_cnt++];
 				t->name = n;
 				t->depth = seen_commits - 1;
+				fprintf(stderr, "depth of %s: %d\n", oid_to_hex(&c->object.oid), t->depth);
 				t->flag_within = 1u << match_cnt;
 				t->found_order = match_cnt;
+				fprintf(stderr, "find order of %s: %d\n", oid_to_hex(&c->object.oid), t->found_order);
 				c->object.flags |= t->flag_within;
+				fprintf(stderr, "setting flag %x for commit %s\n", t->flag_within, oid_to_hex(&c->object.oid));
 				if (n->prio == 2)
 					annotated_cnt++;
 			}
@@ -386,11 +393,15 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		}
 		for (cur_match = 0; cur_match < match_cnt; cur_match++) {
 			struct possible_tag *t = &all_matches[cur_match];
-			if (!(c->object.flags & t->flag_within))
+			if (!(c->object.flags & t->flag_within)) {
 				t->depth++;
+				fprintf(stderr, "flag for %s: %x\n", oid_to_hex(&c->object.oid), c->object.flags);
+				fprintf(stderr, "pushing depth of %s because of %s (flag_within): %d\n", oid_to_hex(&t->name->peeled), oid_to_hex(&c->object.oid), t->depth);
+			}
 		}
 		/* Stop if last remaining path already covered by best candidate(s) */
 		if (annotated_cnt && !list) {
+			fprintf(stderr, "checking for best candidate\n");
 			int best_depth = INT_MAX;
 			unsigned best_within = 0;
 			for (cur_match = 0; cur_match < match_cnt; cur_match++) {
@@ -415,6 +426,7 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 			if (!(p->object.flags & SEEN))
 				commit_list_insert_by_date(p, &list);
 			p->object.flags |= c->object.flags;
+			fprintf(stderr, "setting flag %x for commit %s due to ancestry\n", p->object.flags, oid_to_hex(&p->object.oid));
 			parents = parents->next;
 
 			if (first_parent)

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux