Re: [PATCH 0/3] pack-bitmap: boundary-based bitmap traversal

Derrick Stolee <derrickstolee@xxxxxxxxxx> · Tue, 25 Apr 2023 14:30:01 -0400

On 4/25/2023 2:02 PM, Junio C Hamano wrote:
> Taylor Blau <me@xxxxxxxxxxxx> writes:
> 
>> This improves many cases where using bitmaps was significantly *slower*
>> than regular, non-bitmap traversal. In some instances, using bitmaps is
>> still slower than the non-bitmap case, but it is a substantial
>> improvement over the classic bitmap walk.
>> ...
>> In a large repository on GitHub.com, the timings to compute the objects
>> unique to "master", as in:
>>
>>     $ git rev-list --count --objects master --not --exclude=master --branches
>>
>> improve as follows:
> 
> Curious---when would it be significantly faster to use bitmaps?
> "Most of the time"?  "In not-too-large repository?"

This is an interesting question that is very difficult to predict
in advance.

When building an independent implementation of reachability bitmaps
for Azure Repos, we never managed to make the bitmap method faster
for incremental fetches, on average, and so only enabled them during
full clones (no haves).

Of course, it's relatively easy to construct something that looks
like an incremental fetch but is as expensive as a clone (include
one have being a root commit), but it is very rare in practice.

It would be interesting if we used the initial commit walk to the
boundary as a predictor of whether bitmaps would be good, or if
we should use the tree-parsing walk instead.

But perhaps this series gets the bitmap walk "close enough" in
the typical case that it's better to use bitmaps in the chance that
we have accidentally hit a case where we'd normally need to walk a
lot of trees.

Thanks,
-Stolee