On Thu, Oct 19, 2023 at 10:16:56AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@xxxxxx> writes: > > > There's another way to handle this, which is to conditionally enable the > > object existence check. This would be less of a performance hit compared > > to disabling commit graphs altogether with `--missing`, but still ensure > > that repository corruption was detected. Second, it would not regress > > performance for all preexisting users of `repo_parse_commit_gently()`. > > The above was what I meant to suggest when you demonstrated that the > code with additional check is still much more performant than > running without the commit-graph optimization, yet has observable > impact on performance for normal codepaths that do not need the > extra carefulness. > > But I wasn't sure if it is easy to plumb the "do we want to double > check? in other words, are we running something like --missing that > care the correctness a bit more than usual cases?" bit down from the > caller, because this check is so deep in the callchain. I wonder if we would want a "be extra careful" flag that is read from the environment? That is largely how GIT_REF_PARANOIA works, and then particular operations set it (though actually it is the default these days, so they no longer do so explicitly). I guess that is really like a global variable but even more gross. ;) But it is nice that it can cross process boundaries, because "how careful do we want to be" may be in the eye of the caller, especially for plumbing commands. E.g., even without --missing, you may want "rev-list" to be extra careful about checking the existence of commits. The caller in check_connected() could arguably turn that on by default (the way we used to turn on GIT_REF_PARANOIA for pruning repacks before it became the default). -Peff