On Wed, Sep 28, 2022 at 12:21:15AM -0400, Jeff King wrote:

> The downside is that we're copying the data an extra time. It's not very
> much data, and it's all fixed size, so the compiler should be able to do
> a reasonable job of optimizing here. But I didn't time the potential
> impact.

I timed this using "test-tool read-cache". It's kind of an artificial
benchmark, but it does isolate the code we're touching here.

The results are...not good. Here's the time to read the index of
linux.git 1000 times, before and after this reordering patch:

  Benchmark 1: ./test-tool.old read-cache 1000
    Time (mean ± σ):      2.870 s ±  0.073 s    [User: 2.555 s, System: 0.315 s]
    Range (min … max):    2.789 s …  3.001 s    10 runs

  Benchmark 2: ./test-tool.new read-cache 1000
    Time (mean ± σ):      3.180 s ±  0.080 s    [User: 2.849 s, System: 0.331 s]
    Range (min … max):    3.092 s …  3.297 s    10 runs

  Summary
    './test-tool.old read-cache 1000' ran
      1.11 ± 0.04 times faster than './test-tool.new read-cache 1000'

I think that's probably the nail in the coffin for my proposed approach.
To be fair, it's only .3ms extra for a normal program which reads the
index once. That's not that big in absolute numbers. But there are
larger index files in the wild. And the improvement in simplicity and
readability is simply not that great.

-Peff
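
[Editor's note: the timing output above has the shape of hyperfine output.
A sketch of how a run like this might be reproduced, assuming
"./test-tool.old" and "./test-tool.new" are copies of git's test-tool
binary built before and after the reordering patch (names taken from the
output above, not from a documented setup):

  $ hyperfine --warmup 1 \
      './test-tool.old read-cache 1000' \
      './test-tool.new read-cache 1000'

The ~0.3ms figure quoted for a single index read is the summary
difference spread over the 1000 iterations: (3.180 s - 2.870 s) / 1000
is roughly 0.31 ms per read.]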