On 2/2/2021 8:48 PM, Jonathan Nieder wrote: > Hi, > > Derrick Stolee wrote: >> On 2/2/2021 8:08 PM, Jonathan Nieder wrote: > >>> At Google, we're running into a commit-graph issue that appears to >>> have also arrived as part of this last week's rollout. This one is a >>> bit worse --- it is reproducible for affected users and stops them >>> from being able to do day-to-day development: >> >> You're shipping 'next' widely? I appreciate the extra eyes on >> early bits, so we can find more issues and get them resolved. > > Yes. Changes in 'next' have already gotten all the vetting via code > review that they're going to get; the difference between changes in > 'next' and 'master' is that the latter have had some production > exposure among users of 'next' with the ability to get help from a > local expert, roll back quickly when there's a problem, and so on. I > recommend that anyone with an installation with that ability use > 'next', to improve the quality of code that ultimately is released > from 'master'. > > It also helps us get the chance to use our experience to affect the > direction of a topic before it's too late. This is a good practice. It's also how I found the issues fixed in this series, but that's because I install it locally for my own extra additional testing before shipping it to users. > [...] >>> We have some examples of repositories that were corrupted this way, >>> but we didn't catch them in the act of corruption --- it started >>> happening to several users with this release so we immediately rolled >>> back. >> >> It is definitely related to the split commit-graph during the >> upgrade scenario. Your verify output shows that you are using >> the --split option heavily (possibly with fetch.writeCommitGraph? >> or are you using 'git maintenance run --task=commit-graph'?) > > Yep, the splits come from fetch.writeCommitGraph. > > [...] >>> - what is the recommended way to recover from this state? "git fsck" >>> shows the repositories to have no problems. "git help commit-graph" >>> doesn't show a command for users to use; is >>> `rm -fr .git/objects/info/commit-graphs/` the recommended recovery >>> command? >> >> That, followed by `git commit-graph write --reachable [--changed-paths]` >> depending on what they want. > > Can we package this as something more user-friendly? E.g. > > git commit-graph clear > git commit-graph write --reachable > > If that makes sense to you, I'm happy to send a patch (or to review > one if someone else gets to it first). I'm mostly asking to find out > whether this matches your idea of what the UI should be like. 'clear' is probably fine. I was thinking it might be good to have an option to the 'write' subcommand to clear the existing data, but it's probably better as separate steps. >>> - is there configuration or a patch we can roll out to help affected >>> users recover from this state? >> >> If you are willing, then take v2 of this series and follow through by >> clearing the commit-graph files of affected users. Note that you can >> be proactive using `git commit-graph verify` to see who needs rewrites. > > Does this mean we should change the BUG error message to help affected > users discover how they can recover for themselves (for example, using > commands like the above)? It _is_ a bug that led to this, but it's more about incorrect commit-graph data which could be caused by anything. Better to have a better message such as "your commit-graph file is probably corrupt". > Also, should "git fsck" call "git commit-graph verify" to make the > latter more discoverable? Yes. I thought it did, but I must be incorrect. Thanks, -Stolee