Re: [PATCH] object_as_type: initialize commit-graph-related fields of 'struct commit'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jan 27, 2019 at 02:08:32PM +0100, SZEDER Gábor wrote:
> When the commit graph and generation numbers were introduced in
> commits 177722b344 (commit: integrate commit graph with commit
> parsing, 2018-04-10) and 83073cc994 (commit: add generation number to
> struct commit, 2018-04-25), they tried to make sure that the
> corresponding 'graph_pos' and 'generation' fields of 'struct commit'
> are initialized conservatively, as if the commit were not included in
> the commit-graph file.
> 
> Alas, initializing those fields only in alloc_commit_node() missed the
> case when an object that happens to be a commit is first looked up via
> lookup_unknown_object(), and is then later converted to a 'struct
> commit' via the object_as_type() helper function (either calling it
> directly, or as part of a subsequent lookup_commit() call).
> Consequently, both of those fields incorrectly remain set to zero,
> which means e.g. that the commit is present in and is the first entry
> of the commit-graph file.  This will result in wrong timestamp, parent
> and root tree hashes, if such a 'struct commit' instance is later
> filled from the commit-graph.
> 
> Extract the initialization of 'struct commit's fields from
> alloc_commit_node() into a helper function, and call it from
> object_as_type() as well, to make sure that it properly initializes
> the two commit-graph-related fields, too.  With this helper function
> it is hopefully less likely that any new fields added to 'struct
> commit' in the future would remain uninitialized.
> 
> With this change alloc_commit_index() won't have any remaining callers
> outside of 'alloc.c', so mark it as static.
> 
> Signed-off-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>
> ---
> 
> So, it turns out that ec0c5798ee (revision: use commit graph in
> get_reference(), 2018-12-04) is not the culprit after all, it merely
> highlighted a bug that is as old as the commit-graph feature itself.
> This patch fixes this and all other related issues I reported
> upthread.

And how/why does this affect 'git describe --dirty'?

  - 'git describe' first iterates over all refs, and somewhere deep
    inside for_each_ref() each commit (well, object) a ref points to
    is looked up via lookup_unknown_object().  This leaves all fields
    of the created object zero initialized.

  - Then it dereferences HEAD for '--dirty' and ec0c5798ee's changes
    to get_reference() kick in: lookup_commit() doesn't instantiate a
    brand new and freshly initialized 'struct commit', but returns the
    object created in the previous step converted into 'struct
    commit'.  This conversion doesn't set the commit-graph fields in
    'struct commit', but leaves both as zero.  get_reference() then
    tries to load HEAD's commit information from the commit-graph,
    find_commit_in_graph() sees the the still zero 'graph_pos' field
    and doesn't perform a search through the commit-graph file, and
    the subsequent fill_commit_in_graph() reads the commit info from
    the first entry.

    In case of the failing test I posted earlier, where only the first
    commit is in the commit-graph but HEAD isn't, this means that the
    HEAD's 'struct commit' is filled with the info of HEAD^.

  - Ultimately, the diff machinery then doesn't compare the worktree
    to HEAD's tree, but to HEAD^'s, finds that they differ, hence the
    incorrect '-dirty' flag in the output.

Before ec0c5798ee get_reference() simply called parse_object(), which
ignored the commit-graph, so the issue could remain hidden.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux