Re: [PATCH] commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default

Patrick Steinhardt <ps@xxxxxx> · Wed, 15 Nov 2023 14:35:16 +0100

On Tue, Nov 14, 2023 at 02:43:10PM -0500, Jeff King wrote:
> On Wed, Nov 15, 2023 at 01:51:43AM +0900, Junio C Hamano wrote:
> 
> > >> Both of these are expected failures: we knowingly corrupt the repository
> > >> and circumvent git-gc(1)/git-maintenance(1), thus no commit-graphs are
> > >> updated. If we stick with the new stance that repository corruption
> > >> should not require us to pessimize the common case,...
> > >
> > > Yeah, just like we try to be extra careful while running fsck,
> > > because "--missing" is about finding these "corrupt" cases,
> > > triggering the paranoia mode upon seeing the option would make
> > > sense, no?  It would fix the failure in 6022, right?
> > >
> > > Thanks for working on this.
> > 
> > Just to make sure we do not miscommunicate, I do not think we want
> > to trigger the paranoia mode only in our tests.  We want to be
> > paranoid to help real users who used "--missing" for their real use,
> > so enabling PARANOIA in the test script is a wrong approach.  We
> > should enable it inside "rev-list --missing" codepath.
> 
> Yeah. Just like we auto-enabled GIT_REF_PARANOIA for git-gc, etc, I
> think we should do the same here.

I'm honestly still torn on this one. There are two cases that I can
think of where missing objects would be benign and where one wants to
use `git rev-list --missing`:

    - Repositories with promisor remotes, to find the boundary of where
      we need to fetch new objects.

    - Quarantine directories where you only intend to list new objects
      or find the boundary.

And in neither of those cases can I see a path for how the commit-graph
would contain such missing commits when using regular tooling to perform
repository maintenance.

So I'm still not sure why we think that this case is so much more
special than others. If a user wants to check for repository corruption
the tool shouldn't be `git rev-list --missing`, but git-fsck(1). To me,
the former is only useful in very specific circumstances where the user
knows what they are doing, and in none of the use cases I can think of
should we have a stale commit-graph _unless_ we have actual repository
corruption.

In reverse, to me this means that `--missing` is no more special than
any of the other low-level tooling, where our stance seems to be "We
assume that the repository is not corrupt". In that spirit, I'd argue
that the same default value should apply here as for all the other
cases.
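
To make the trade-off concrete, here is a minimal standalone sketch of
the idea. It is a toy model in Python, not Git's actual C code: the
helpers `env_bool()` and `lookup_commit()` and the set-based "graph"
and "odb" are hypothetical stand-ins, and the accepted boolean
spellings are an assumption.

```python
import os

# Hypothetical model only; names and accepted spellings are assumptions.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off", ""}


def env_bool(name, default, environ=None):
    """Read a boolean from the environment, falling back to `default`."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"bad boolean value {raw!r} for {name}")


def lookup_commit(oid, graph, odb, paranoia):
    """Return True if `oid` is treated as present.

    `graph` stands in for the commit-graph and `odb` for the object
    database; both are plain sets of object IDs in this sketch.
    """
    if oid in graph:
        # With paranoia off we trust the commit-graph entry as-is.
        # With paranoia on we pay for an extra existence check.
        return oid in odb if paranoia else True
    return oid in odb


# A stale commit-graph: it lists "abc", but the object is gone.
graph, odb = {"abc"}, set()
paranoia = env_bool("GIT_COMMIT_GRAPH_PARANOIA", default=False, environ={})
print(lookup_commit("abc", graph, odb, paranoia))
```

With the default flipped to off, the stale entry is reported as present;
only a caller that opts into paranoia notices the missing object. That
is the behaviour I am arguing is acceptable for low-level tooling.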

But based on the discussion it very much feels like I'm alone with this
train of thought, which may indicate that I'm missing a quintessential
part of your arguments. May just as well be that I'm too focussed on the
use cases we at GitLab have for the new `--missing` behaviour around
commits that Karthik has just introduced.

Oh, well. I'll wait for answers to this reply until tomorrow, and if I
still haven't been able to convince anybody then I'll assume it's just
me and adapt accordingly :)

> As we are closing in on the v2.43 release, there's one thing I'm not
> sure about regarding release planning. Are these test cases that _used_
> to detect the corruption, but now don't? I.e., I would have expected
> that disabling GIT_COMMIT_GRAPH_PARANOIA would take us back to the same
> state as v2.42. But I think it doesn't because of the hunk in e04838ea82
> (commit-graph: introduce envvar to disable commit existence checks,
> 2023-10-31) that makes the has_object() call conditional (and now
> defaults to off).
> 
> What I'm getting at is that I think we have three options for v2.43:
> 
>   1. Ship what has been in the release candidates, which has a known
>      performance regression (though the severity is up for debate).

This seems like the best option for now in my opinion. The new behaviour
is not a bug, quite the contrary, even though it is slower.

As Junio once said, we are not a "performance is king" project [1]. This
has burnt itself into my mind, and funnily enough it was in the vicinity
of the change where I originally introduced the other object existence
check into `lookup_commit_in_graph()`.

[1]: <xmqqr1i1t6zl.fsf@gitster.g>

>   2. Flip the default to "0" (i.e., Patrick's patch in this thread). We
>      know that loosens some cases versus 2.42, which may be considered a
>      regression.

If we consider this to be a regression then I'd rather drop this
patch completely and leave it be. Ultimately, the question is how much
we trust our tooling to keep the commit-graph up-to-date, and whether or
not we need to account for corrupted repositories.

I for one do trust the tooling, otherwise I wouldn't have sent this
patch. But I'm also happy to accept the current status where we are
being more thorough at the cost of performance.

>   3. Sort it out before the release. We're getting pretty close to do
>      a lot of new work there, but I think the changes should be small-ish.
>      The nuclear option is ejecting the topic and re-doing it in the
>      next cycle.

I would be comfortable with this option if we simply switch the default
without adding special-casing for specific options like `--missing`. But
otherwise I'd rather not rush such a change.

Patrick

> I don't have a really strong preference between the three.
> 
> -Peff