Re: [PATCH v3 2/4] revision: stop retrieving reference twice

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Mon, 02 Aug 2021 14:53:30 +0200

On Mon, Aug 02 2021, Patrick Steinhardt wrote:

> When queueing up references for the revision walk, `handle_one_ref()`
> will resolve the reference's object ID and then queue the ID as a pending
> object via `add_pending_oid()`. But `add_pending_oid()` will again try
> to resolve the object ID to an object, effectively duplicating the work
> its caller already did before.
>
> Fix the issue by instead calling `add_pending_object()`, which takes the
> already-resolved object as input. In a repository with lots of refs,
> this translates into a nearly 10% speedup:
>
>     Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
>       Time (mean ± σ):      5.015 s ±  0.038 s    [User: 4.698 s, System: 0.316 s]
>       Range (min … max):    4.970 s …  5.089 s    10 runs
>
>     Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
>       Time (mean ± σ):      4.606 s ±  0.029 s    [User: 4.260 s, System: 0.345 s]
>       Range (min … max):    4.565 s …  4.657 s    10 runs
>
>     Summary
>       'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran
>         1.09 ± 0.01 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev'

It might be worth calling out explicitly that the duplication is not
just "effective" but literal: add_pending_oid() is a thin wrapper for
get_reference() followed by add_pending_object(), so we're guaranteed to
get the exact same result here as before, just without the duplicate
work.

I.e. we're not going down some other lookup path that uses object
lookups with different flags or whatever as a result of this change.
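
To illustrate the point, here is a minimal standalone sketch of the call
structure. It is not git's code: the types, the `_sketch` function names,
the `lookups` counter, and the integer IDs are simplified stand-ins I made
up so the example compiles on its own. It only models the shape described
above, where the wrapper is a lookup followed by the queueing call.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration only; NOT git's real definitions. */
struct object { int id; };
struct pending_list { const struct object *items[8]; size_t nr; };

#define NR_OBJECTS 4
static struct object object_store[NR_OBJECTS] = { {0}, {1}, {2}, {3} };

/* Counts how often the (in reality expensive) ID-to-object lookup runs. */
static int lookups;

/* Stand-in for get_reference(): resolve an ID to an object. */
static struct object *get_reference_sketch(int id)
{
	assert(id >= 0 && id < NR_OBJECTS);
	lookups++;
	return &object_store[id];
}

/* Stand-in for add_pending_object(): queue an already-resolved object. */
static void add_pending_object_sketch(struct pending_list *list,
				      const struct object *obj)
{
	if (list->nr < 8)
		list->items[list->nr++] = obj;
}

/* Stand-in for add_pending_oid(): nothing but lookup + queue. */
static void add_pending_oid_sketch(struct pending_list *list, int id)
{
	add_pending_object_sketch(list, get_reference_sketch(id));
}

/* Before the patch: the caller resolves, then the wrapper resolves again. */
static void handle_one_ref_before(struct pending_list *list, int id)
{
	struct object *obj = get_reference_sketch(id);
	(void)obj; /* resolved here, but the wrapper below looks it up again */
	add_pending_oid_sketch(list, id);
}

/* After the patch: the already-resolved object is queued directly. */
static void handle_one_ref_after(struct pending_list *list, int id)
{
	struct object *obj = get_reference_sketch(id);
	add_pending_object_sketch(list, obj);
}
```

In this model both variants queue the same object pointer; the only
difference is that the "before" variant performs the lookup twice per ref.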