Re: Possible git bug in commit-graph: "invalid commit position"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6/21/2020 4:45 PM, Michael Forney wrote:
> On 2020-05-15, Alex Riesen <alexander.riesen@xxxxxxxxxxx> wrote:
>> Gary Oberbrunner, Tue, Feb 04, 2020 23:33:42 +0100:
>>> Sorry for the long reply delay; the bug went away and only just showed
>>> up again. Here's the info you requested.
>>> I'm now running git 2.25.0.
>>
>> I hit a very similar problem today with 2.26.0. Also in a submodule.
>>
>> Removing and regenerating the commit graph did not help and I did not have
>> the
>> commit-graphs directory (only a file). "git commit-graph verify" does not
>> find
>> anything. Switching writeCommitGraph on and regenerating the commit graph
>> makes no difference.
>>
>> I can trigger it reliably by visiting the broken(?) commit in supermodule
>> with:
>>
>>     git show --submodule=log <commit>
>>
>> I see nothing special in the commit invovled. It is just a single commit in
>> my
>> case, and the commit is a merge of two branches.
> 
> I hit this bug a while back, and it went away after I deleted the
> commit-graph in the submodule and regenerated it (IIRC).
> 
> I just ran into it again (on 2.27.0), and this time, I did some digging.
> 
> I have a repository containing a number of submodules, and the bug
> appeared after I updated one of the submodules, and then looked at
> `git log -p` with diff.submodule = log. Just like Alex, I can reliably
> trigger the error with `git show --submodule=log <commit>`.
> 
> I rebuilt git with some print statements to try to see what's going
> on, and got the following:
> 
> 	/src/oasis/.git/modules/pkg/file/src c81d1ccbf4c224af50e6d556419961dba72666c7
> 		pos: 4986, num_commits: 6452, num_commits_in_base: 0
> 	/src/oasis/.git/modules/pkg/file/src 9f2f793847c6aeab9501287b6847dc842c84630f
> 		pos: 3964, num_commits: 6452, num_commits_in_base: 0
> 	/src/oasis/.git/modules/pkg/file/src fd7eb1f793944635b92bfa56a84a4dc1dbefb119
> 		pos: 6383, num_commits: 6452, num_commits_in_base: 0
> 	/src/oasis/.git/modules/pkg/file/src d955cefc956ba537cfc0556023a65fe80bd2d82b
> 		pos: 5436, num_commits: 6452, num_commits_in_base: 0
> 	/src/oasis/.git/modules/pkg/file/src 0c79c693d6a86f7ad7ada2a9a1eb3bdf483f77cc
> 		pos: 301, num_commits: 6452, num_commits_in_base: 0
> 	.git fa09b87efa9b9664e4e53ab768cfa5f51a6c6fa2
> 		pos: 6292, num_commits: 5177, num_commits_in_base: 0
> 	fatal: invalid commit position. commit-graph is likely corrupt
> 
> Using `git commit-graph verify`, I confirmed that the main
> repository's commit graph contains 5177 commits, and the submodule
> repository's commit-graph contains 6452 commits. Commit fa09b8 is part
> of the submodule, not the main repository, so it makes sense that it
> is an invalid commit for the main repositories commit-graph.
> 
> So, this seems a little fishy. fill_commit_in_graph is getting called
> with the main repository and a commit belonging to the submodule.
> Looking through the call stack in gdb, I see that the initial calls to
> fill_commit_in_graph come from show_submodule_header, which computes
> left, right, and merge_bases. Then, those commits are passed to
> prepare_submodule_summary, but this function does *not* accept a
> submodule parameter. prepare_submodule_summary calls
> repo_init_revisions with the_repository, which seems to be the source
> of the problem. I think it should be using the submodule repository
> instead.
> 
> I changed prepare_submodule_summary to accept a repository and to use
> that instead, but the issue persisted. Digging deeper, this is because
> revision.c:process_parents uses parse_commit_gently, which is a
> synonym for repo_parse_commit_gently(the_repository, ...). I changed
> it to use repo_parse_commit_gently(revs->repo, ...), and this time,
> the problem went away.
> 
> I'm not very familiar with the git codebase, but am I on the right
> track here? I also noticed a number of other calls to
> parse_commit_gently in revision.c, and I think those should pass
> revs->repo as well. Does that sound right? If so, I can send a patch
> to fix these issues.

This is some good digging, and I think you are absolutely correct
with the root cause. The dependence on the_repository is still
something that is being worked on in the Git codebase (but less
frequently lately) and trips up submodule things like this.

I think a simple method swap would be a good patch to send, and
you can include many of the details above in the commit message.
Is that a contribution you have time to make? I'll gladly review
it.

Thanks,
-Stolee






[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux