Re: [PATCH] check_everything_connected: assume alternate ref tips are valid

Derrick Stolee <stolee@xxxxxxxxx> · Fri, 28 Jun 2019 08:51:04 -0400

On 6/28/2019 6:11 AM, Jeff King wrote:
> When we receive a remote ref update to sha1 "X", we want to check that
> we have all of the objects needed by "X". We can assume that our
> repository is not currently corrupted, and therefore if we have a ref
> pointing at "Y", we have all of its objects. So we can stop our
> traversal from "X" as soon as we hit "Y".
> 
> If we make the same non-corruption assumption about any repositories we
> use to store alternates, then we can also use their ref tips to shorten
> the traversal.

I was confused by this paragraph, because I didn't know about
for_each_alternate_ref() and how refs_From_alternate_cb() will
strip the "/objects" and append "/refs" to check refs if they
exist. All of that logic is in transport.c but used by
fetch-pack.c and builtin/receive-pack.c. But now we are adding
to revision.c, so the restriction to "this helps data transfer"
is getting murkier.

Is this something that should be extracted to the object-store
layer? Or is it so tricky to use that we shouldn't make it too
easy to fall into a bad pattern?

> This is especially useful when cloning with "--reference", as we
> otherwise do not have any local refs to check against, and have to
> traverse the whole history, even though the other side may have sent us
> few or no objects. Here are results for the included perf test (which
> shows off more or less the maximal savings, getting one new commit and
> sharing the whole history):
> 
> Test                        HEAD^             HEAD
> --------------------------------------------------------------------
> [on git.git]
> 5600.3: clone --reference   2.94(2.86+0.08)   0.09(0.08+0.01) -96.9%
> [on linux.git]
> 5600.3: clone --reference   45.74(45.34+0.41)   0.36(0.30+0.08) -99.2%

It's really hard to argue with numbers like these. Kudos!

> 
> Signed-off-by: Jeff King <peff@xxxxxxxx>
> ---
>  Documentation/rev-list-options.txt |  8 ++++
>  connected.c                        |  1 +
>  revision.c                         | 30 +++++++++++++++
>  t/perf/p5600-clone-reference.sh    | 27 ++++++++++++++
>  t/t5618-alternate-refs.sh          | 60 ++++++++++++++++++++++++++++++

Other than the high-level questions above, the code and tests look
good to me.

Thanks,
-Stolee