Patrick Steinhardt <ps@xxxxxx> writes:

> Wouldn't this have the potential to significantly regress performance
> for all those preexisting users of the `--missing` option? The commit
> graph is quite an important optimization nowadays, and especially in
> commands where we potentially walk a lot of commits (like we may do
> here) it can result in queries that are orders of magnitudes faster.

The test fails only when GIT_TEST_COMMIT_GRAPH is on, which updates
the commit-graph every time a commit is made via "git commit" or
"git merge".

I'd suggest stepping back and thinking a bit.  My assumption has been
that the failing test emulates a scenario that can happen in real
life (a rough reproduction script appears at the end of this
message):

 * The user creates a new commit.

 * A commit graph is written (not via GIT_TEST_COMMIT_GRAPH, which is
   not realistic, but as part of "git maintenance").

 * The repository loses some objects due to corruption.

 * Now, "--missing=print" is invoked so that the user can see what is
   missing.  Or "--missing=allow-promisor" is used to ensure that the
   repository does not have missing objects other than the ones the
   promisor would give us if we asked again.

 * But because the connectivity of these objects appears in the
   commit-graph file, we fail to notice that these objects are
   missing, producing wrong results.

If we disabled the commit-graph during traversal (an earlier writing
of it was perfectly OK), then "rev-list --missing" would have noticed
and reported what the user wanted to know.

In other words, the "optimization" you value is working to quickly
produce a wrong result.  Is it "significantly regress"ing if we
disabled it to obtain the correct result?

My assumption also has been that there is no point in running
"rev-list --missing" if we know there is no repository corruption,
and those who run "rev-list --missing" want to know if the objects
are really available, i.e. even if a commit-graph that is out of
sync with reality says an object exists, if it is not in the object
store, they would want to know that.

If you can show me that this is not the case, then I may be persuaded
why producing a result that is out of sync with reality _quickly_,
instead of taking time to produce a result that matches reality, is a
worthy "optimization" to keep.

Thanks.
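
P.S. Here is the reproduction script I had in mind, in case it helps.
This is an untested sketch: the repository name "repro" is made up, it
assumes the commits are still loose objects (no repack has happened),
and it assumes the semantics this series proposes, where a missing
commit is reported as "?<oid>".

    $ git init repro && cd repro
    $ git commit --allow-empty -m one
    $ git commit --allow-empty -m two

    # Write a commit graph, the way "git maintenance run" or
    # "git gc" would do for the user in real life.
    $ git commit-graph write --reachable

    # Simulate corruption by losing the loose object for HEAD~1.
    $ oid=$(git rev-parse HEAD~1)
    $ rm .git/objects/${oid:0:2}/${oid:2}

    # With the commit graph consulted, the traversal parses HEAD~1
    # out of the graph file and may report nothing missing.
    $ git rev-list --objects --missing=print HEAD

    # With the graph disabled, the lost commit should be noticed
    # and reported.
    $ git -c core.commitGraph=false rev-list --objects --missing=print HEAD

The last two invocations are the point of the exercise: the only
difference between them is whether core.commitGraph lets the
traversal trust the graph file, and only the second one tells the
user the truth about the object store.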