Re: [PATCH v2 2/6] commit-graph: always parse before commit_graph_data_at()

On 2/2/2021 8:08 PM, Jonathan Nieder wrote:
> Derrick Stolee wrote:
> 
>> There is a subtle failure happening when computing corrected commit
>> dates with --split enabled. It requires a base layer needing the
>> generation_data_overflow chunk. Then, the next layer on top
>> erroneously thinks it needs an overflow chunk due to a bug leading
>> to recalculating all reachable generation numbers. The output of
>> the failure is
>>
>>   BUG: commit-graph.c:1912: expected to write 8 bytes to
>>   chunk 47444f56, but wrote 0 instead
> 
> At Google, we're running into a commit-graph issue that appears to
> have also arrived as part of this last week's rollout.  This one is a
> bit worse --- it is reproducible for affected users and stops them
> from being able to do day-to-day development:

You're shipping 'next' widely? I appreciate the extra eyes on
early bits, so we can find more issues and get them resolved.

>   $ git pull
>   remote: Finding sources: 100% (33/33)
>   remote: Total 33 (delta 18), reused 33 (delta 18)
>   Unpacking objects: 100% (33/33), 27.44 KiB | 460.00 KiB/s, done.
>   From https://example.com/path/to/repo
>      05ba0d775..279e4e6d0  master     -> origin/master
>   BUG: commit-reach.c:64: bad generation skip     29e3 >      628 at 62abdabd1be00ebadbf73061ecf72b35042338e3
>   error: merge died of signal 6
> 
> "git commit-graph verify" agrees that the generation numbers are wrong:
> 
>   $ git commit-graph verify
>   commit-graph generation for commit 4290b2214cdf50263118322735347d151715a272 is 3 != 1586
>   Verifying commits in commit graph: 100% (1/1), done.
>   commit-graph generation for commit b6c73a8472c7cb503cce0668849150a4b4329230 is 1576 != 10724
>   Verifying commits in commit graph: 100% (10/10), done.
>   Verifying commits in commit graph: 100% (88/88), done.
>   Verifying commits in commit graph: 100% (208/208), done.
>   Verifying commits in commit graph: 100% (592/592), done.
>   Verifying commits in commit graph: 100% (1567/1567), done.
>   Verifying commits in commit graph: 100% (8358/8358), done.
> 
> We have some examples of repositories that were corrupted this way,
> but we didn't catch them in the act of corruption --- it started
> happening to several users with this release so we immediately rolled
> back.

It is definitely related to the split commit-graph during the
upgrade scenario. Your verify output shows that you are using
the --split option heavily (possibly with fetch.writeCommitGraph?
or are you using 'git maintenance run --task=commit-graph'?)

> Questions:
> 
> - is this likely to be due to the same cause, or is it orthogonal?

My guess is that the reason is the same. I think you might have a
commit-graph layer with corrected commit dates sitting below a
commit-graph layer without them.

> - what is the recommended way to recover from this state?  "git fsck"
>   shows the repositories to have no problems.  "git help commit-graph"
>   doesn't show a command for users to use; is
>   `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
>   command?

That, followed by `git commit-graph write --reachable [--changed-paths]`
depending on what they want.
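
As a minimal sketch of that sequence, assuming the split layers live
under the repository's own object directory (no alternates involved):
`rebuild_commit_graph` is a hypothetical helper name, not a git command.

```shell
#!/bin/sh
# Sketch only: remove the split commit-graph layers and rewrite a
# single fresh commit-graph file from all reachable commits.
# rebuild_commit_graph is a hypothetical name used for illustration.
rebuild_commit_graph () (
    # Run in a subshell so the caller's working directory is untouched.
    cd "${1:-.}" || exit 1

    # Ask git where the object directory is instead of assuming
    # .git/objects (covers worktrees and GIT_OBJECT_DIRECTORY).
    objdir=$(git rev-parse --git-path objects) || exit 1

    # Drop every split layer along with the commit-graph-chain file.
    rm -rf "$objdir/info/commit-graphs"

    # Rebuild from scratch; append --changed-paths here if the
    # repository should also carry changed-path Bloom filters.
    git commit-graph write --reachable
)
```

Calling `rebuild_commit_graph /path/to/repo` leaves a single
`objects/info/commit-graph` file in place of the layered chain.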

> - is there configuration or a patch we can roll out to help affected
>   users recover from this state?

If you are willing, then take v2 of this series and follow through by
clearing the commit-graph files of affected users. Note that you can
be proactive using `git commit-graph verify` to see who needs rewrites.
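
A rough sketch of such a sweep, assuming `git commit-graph verify`
exits non-zero whenever it reports a problem. `needs_rewrite` and
`list_bad_repos` are hypothetical helper names, and any failure of the
command (including a path that is not a repository) gets flagged.

```shell
#!/bin/sh
# Sketch only: report which repositories fail commit-graph verification.
# Both function names are hypothetical, chosen for illustration.

# Succeeds (returns 0) when verify fails for the repository at $1.
needs_rewrite () {
    ! git -C "$1" commit-graph verify >/dev/null 2>&1
}

# Print the path of every repository argument that needs a rewrite.
list_bad_repos () {
    for repo in "$@"
    do
        if needs_rewrite "$repo"
        then
            printf '%s\n' "$repo"
        fi
    done
}
```

For example, `list_bad_repos ~/src/*` prints one path per line for the
repositories whose commit-graph should be cleared and rewritten.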

Thanks,
-Stolee


