On Sun, Jan 27, 2019 at 02:08:32PM +0100, SZEDER Gábor wrote: > When the commit graph and generation numbers were introduced in > commits 177722b344 (commit: integrate commit graph with commit > parsing, 2018-04-10) and 83073cc994 (commit: add generation number to > struct commit, 2018-04-25), they tried to make sure that the > corresponding 'graph_pos' and 'generation' fields of 'struct commit' > are initialized conservatively, as if the commit were not included in > the commit-graph file. > > Alas, initializing those fields only in alloc_commit_node() missed the > case when an object that happens to be a commit is first looked up via > lookup_unknown_object(), and is then later converted to a 'struct > commit' via the object_as_type() helper function (either calling it > directly, or as part of a subsequent lookup_commit() call). > Consequently, both of those fields incorrectly remain set to zero, > which means e.g. that the commit is present in and is the first entry > of the commit-graph file. This will result in wrong timestamp, parent > and root tree hashes, if such a 'struct commit' instance is later > filled from the commit-graph. > > Extract the initialization of 'struct commit's fields from > alloc_commit_node() into a helper function, and call it from > object_as_type() as well, to make sure that it properly initializes > the two commit-graph-related fields, too. With this helper function > it is hopefully less likely that any new fields added to 'struct > commit' in the future would remain uninitialized. > > With this change alloc_commit_index() won't have any remaining callers > outside of 'alloc.c', so mark it as static. > > Signed-off-by: SZEDER Gábor <szeder.dev@xxxxxxxxx> > --- > > So, it turns out that ec0c5798ee (revision: use commit graph in > get_reference(), 2018-12-04) is not the culprit after all, it merely > highlighted a bug that is as old as the commit-graph feature itself. > This patch fixes this and all other related issues I reported > upthread. And how/why does this affect 'git describe --dirty'? - 'git describe' first iterates over all refs, and somewhere deep inside for_each_ref() each commit (well, object) a ref points to is looked up via lookup_unknown_object(). This leaves all fields of the created object zero initialized. - Then it dereferences HEAD for '--dirty' and ec0c5798ee's changes to get_reference() kick in: lookup_commit() doesn't instantiate a brand new and freshly initialized 'struct commit', but returns the object created in the previous step converted into 'struct commit'. This conversion doesn't set the commit-graph fields in 'struct commit', but leaves both as zero. get_reference() then tries to load HEAD's commit information from the commit-graph, find_commit_in_graph() sees the the still zero 'graph_pos' field and doesn't perform a search through the commit-graph file, and the subsequent fill_commit_in_graph() reads the commit info from the first entry. In case of the failing test I posted earlier, where only the first commit is in the commit-graph but HEAD isn't, this means that the HEAD's 'struct commit' is filled with the info of HEAD^. - Ultimately, the diff machinery then doesn't compare the worktree to HEAD's tree, but to HEAD^'s, finds that they differ, hence the incorrect '-dirty' flag in the output. Before ec0c5798ee get_reference() simply called parse_object(), which ignored the commit-graph, so the issue could remain hidden.