On Mon, Oct 04, 2021 at 04:13:34AM -0400, Jeff King wrote:

> It looks like adding the "algo" field did make a big difference for the
> oid_array case, but changing it to a char doesn't seem to help at all:
>
>   $ hyperfine -L v none,int,char './git.{v} cat-file --batch-all-objects --batch-check="%(objectname)"'
>   Benchmark #1: ./git.none cat-file --batch-all-objects --batch-check="%(objectname)"
>     Time (mean ± σ):      1.653 s ±  0.009 s    [User: 1.607 s, System: 0.046 s]
>     Range (min … max):    1.640 s …  1.670 s    10 runs
>
>   Benchmark #2: ./git.int cat-file --batch-all-objects --batch-check="%(objectname)"
>     Time (mean ± σ):      1.067 s ±  0.012 s    [User: 1.017 s, System: 0.050 s]
>     Range (min … max):    1.053 s …  1.089 s    10 runs
>
>   Benchmark #3: ./git.char cat-file --batch-all-objects --batch-check="%(objectname)"
>     Time (mean ± σ):      1.092 s ±  0.013 s    [User: 1.046 s, System: 0.046 s]
>     Range (min … max):    1.080 s …  1.116 s    10 runs
>
>   Summary
>     './git.int cat-file --batch-all-objects --batch-check="%(objectname)"' ran
>       1.02 ± 0.02 times faster than './git.char cat-file --batch-all-objects --batch-check="%(objectname)"'
>       1.55 ± 0.02 times faster than './git.none cat-file --batch-all-objects --batch-check="%(objectname)"'
>
> I'm actually surprised it had this much of an impact. But I guess this
> benchmark really is mostly just memcpy-ing oids into a big array,
> sorting it, and printing the result. If that array is 12% bigger, we'd
> expect at least a 12% speedup. But adding in non-linear elements like
> growing the array (though I guess that is amortized linear) and sorting
> (which picks up an extra log(n) term) make the difference.
>
> It's _kind of_ silly in a sense, since usually you'd ask for other parts
> of the object, which will make the speed difference relatively smaller.
> But just dumping a bunch of oids is actually not an unreasonable thing
> to do. I suspect it got a lot slower with 32-byte GIT_MAX_RAWSZ, too
> (even when you're using 20-byte sha1), but I don't think there's an easy
> way to get out of that.

Oh wait, I'm reading it totally wrong. Adding in the extra 4 bytes
actually made it _faster_ than not having an algo field. Now I'm
super-confused.

I could believe that it gave us some better alignment, but the original
struct was 32 bytes. 36 seems like a strictly worse number.

-Peff