On 6/9/2022 11:26 AM, Jeff King wrote:
> On Thu, Jun 09, 2022 at 09:49:15AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> It's certainly interesting to see *how* we got to this state, but just
>> so we're on the same page: I fundimentally don't think it matters to the
>> *real* bug here.
>>
>> Which is that at the very least f90fca638e9 (commit-graph: consolidate
>> fill_commit_graph_info, 2021-01-16) and e8b63005c48 (commit-graph:
>> implement generation data chunk, 2021-01-16) (CC'd author) have a bad
>> regression on earlier fixes that read-only operations of the
>> commit-graph *must not die*. I.e. the "parse" and "verify" paths of the
>> commit-graph.c code shouldn't call exit(), die() etc.
>
> Yeah, I'd agree that this is a good philosophy to follow. The
> commit-graph data is meant to be an optimization, and we can always
> continue without it.

I agree that the die() is part of what is frustrating here, but we
need to be careful: when we recognize that the commit-graph data is
erroneous _at this stage_ we may have already made decisions based
on the existence of a commit-graph (such as "we should trust
generation numbers" or "we have parsed some of the commits using the
commit-graph") and so we cannot guarantee that the process will
complete with correct results from this point.

>> If you replace your graph with Jeff's corrupt one and run "git status",
>> "git log" etc. it's still emitting one verbose complaint, but it no
>> longer does so in loops (at least for these paths, but e.g. "git gc" is
>> still doing that).
>>
>> But it does get us to where we can run "git gc", and while complaining
>> too much along the way will write out a new & valid commit graph at the
>> end ("[... comments are mine"):
>
> Yeah, getting through "git gc" is the key thing here. Then the problem
> solves itself, sometimes even automatically (via auto-gc).

Perhaps we could change the die() behavior to be a warning() plus a
reset of all commit data if we are in a mode that can handle that
error, specifically during a commit-graph write.

I think the current die() behavior is the safest thing to do right
now, until someone has time to think through these scenarios carefully.

Thanks,
-Stolee
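
To make the "warning() plus a reset" idea a bit more concrete, here is a
rough sketch in standalone C. It is a toy, not a patch: every name in it
(toy_state, toy_commit, reset_commit_data(), handle_corrupt_graph()) is
made up for illustration and does not exist in commit-graph.c, and the
fprintf()/exit() calls only stand in for warning() and die(). A real reset
would have to cover whatever per-commit state has actually been populated
from the graph, which is exactly the part that needs careful thought.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are git's real types or functions. */
enum graph_mode {
	MODE_READ_ONLY,   /* e.g. a walk that may already rely on graph data */
	MODE_GRAPH_WRITE  /* rewriting the graph; everything can be recomputed */
};

struct toy_commit {
	int parsed_from_graph;    /* was this commit filled in from the graph? */
	unsigned long generation; /* 0 means "not known" in this sketch */
};

struct toy_state {
	enum graph_mode mode;
	int trust_generation_numbers;
	struct toy_commit *commits;
	size_t nr;
};

/* Forget every decision that was derived from the (now suspect) graph. */
void reset_commit_data(struct toy_state *s)
{
	size_t i;

	assert(s->commits || !s->nr);
	s->trust_generation_numbers = 0;
	for (i = 0; i < s->nr; i++) {
		s->commits[i].parsed_from_graph = 0;
		s->commits[i].generation = 0;
	}
}

/*
 * Returns 0 if the caller may carry on without the graph.
 * In any other mode it exits, which mirrors the current behavior.
 */
int handle_corrupt_graph(struct toy_state *s, const char *msg)
{
	if (s->mode == MODE_GRAPH_WRITE) {
		/* stand-in for warning() */
		fprintf(stderr, "warning: %s; ignoring commit-graph\n", msg);
		reset_commit_data(s);
		return 0;
	}
	/* stand-in for die() */
	fprintf(stderr, "fatal: %s\n", msg);
	exit(128);
}
```

The point of the mode check is that only a caller that can recompute
everything from the object database gets the soft path; any other caller
keeps the existing hard failure, since it may already have acted on data
from the bad graph.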