On Fri, Oct 20, 2023 at 06:00:24AM -0400, Jeff King wrote: > On Thu, Oct 19, 2023 at 10:16:56AM -0700, Junio C Hamano wrote: > > > Patrick Steinhardt <ps@xxxxxx> writes: > > > > > There's another way to handle this, which is to conditionally enable the > > > object existence check. This would be less of a performance hit compared > > > to disabling commit graphs altogether with `--missing`, but still ensure > > > that repository corruption was detected. Second, it would not regress > > > performance for all preexisting users of `repo_parse_commit_gently()`. > > > > The above was what I meant to suggest when you demonstrated that the > > code with additional check is still much more performant than > > running without the commit-graph optimization, yet has observable > > impact on performance for normal codepaths that do not need the > > extra carefulness. > > > > But I wasn't sure if it is easy to plumb the "do we want to double > > check? in other words, are we running something like --missing that > > care the correctness a bit more than usual cases?" bit down from the > > caller, because this check is so deep in the callchain. > > I wonder if we would want a "be extra careful" flag that is read from > the environment? That is largely how GIT_REF_PARANOIA works, and then > particular operations set it (though actually it is the default these > days, so they no longer do so explicitly). > > I guess that is really like a global variable but even more gross. ;) > But it is nice that it can cross process boundaries, because "how > careful do we want to be" may be in the eye of the caller, especially > for plumbing commands. E.g., even without --missing, you may want > "rev-list" to be extra careful about checking the existence of commits. > The caller in check_connected() could arguably turn that on by default > (the way we used to turn on GIT_REF_PARANOIA for pruning repacks before > it became the default). Good point indeed, I also started to ask myself whether we'd want to have a config option. An environment variable `GIT_OBJECT_PARANOIA` would work equally well though and be less "official". What I like about this idea is that it would also allow us to easily unify the behaviour of `lookup_commit_in_graph()` and the new sanity check in `repo_parse_commit_gently()` that I'm introducing here. I'd personally think that the default for any such toggle should be `true`, also to keep the current behaviour of `lookup_commit_in_graph()`. But changing it would be easy enough. I'll send a v2 of this series with such a toggle. Patrick
Attachment:
signature.asc
Description: PGP signature