Re: [PATCH v3 09/10] pack-revindex: ensure that on-disk reverse indexes are given precedence

Taylor Blau <me@xxxxxxxxxxxx> · Thu, 28 Jan 2021 20:25:30 -0500

On Thu, Jan 28, 2021 at 07:53:34PM -0500, Jeff King wrote:
> On Mon, Jan 25, 2021 at 06:37:46PM -0500, Taylor Blau wrote:
>
> > When an on-disk reverse index exists, there is no need to generate one
> > in memory. In fact, doing so can be slow, and require large amounts of
> > the heap.
> >
> > Let's make sure that we treat the on-disk reverse index with precedence
> > (i.e., that when it exists, we don't bother trying to generate an
> > equivalent one in memory) by teaching Git how to conditionally die()
> > when generating a reverse index in memory.
> >
> > Then, add a test to ensure that when (a) an on-disk reverse index
> > exists, and (b) when setting GIT_TEST_REV_INDEX_DIE_IN_MEMORY, that we
> > do not die, implying that we read from the on-disk one.
>
> I don't love this kind of hackery, as it will have to live forever if we
> want to keep testing this feature. But I also don't know how else to
> tell in the regular test suite that we are avoiding the slow path.
>
> Would it be enough to instead add a t/perf test which checks the speed
> of:
>
>   echo HEAD | git cat-file --batch-check='%(objectsize:disk)'
>
> ? I hate relying on the perf suite, because it gets run way less often
> (and requires a lot of squinting to interpret the results). But it
> wouldn't require any extra code the binary, as it's observing the actual
> user-visible thing we care about: speed.
>
> (I guess we care as much or more about peak heap usage, but measuring
> that is a whole other can of worms).

Yeah, I think that the thing we care most about is peak RSS, but I agree
that I don't really want to wade into figuring out how to measure that
reliably during test time (I think there is a separate and less relevant
argument about whether that is something that we should be testing or
not).

But, getting back to your original comment, I think that I actually
prefer to have the GIT_TEST_XYZ_DIE stuff in the binary than I do
relying on the perf suite to catch stuff like this.

I understand your concern about the binary size. I guess you could
#ifdef this out and only build it in during the test suite, but then
you're testing a different binary, and so that calls into question the
integrity of the test suite itself, etc. etc.

So, I guess that's all to say that I while I do find this to be hack-y
and gross, I don't think that it's all that bad when you compare it to
the alternatives.

As usual, I'm happy to change it if you feel strongly that it should be
changed.

> -Peff

Thanks,
Taylor