On Mon, Nov 12, 2018 at 05:01:02PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > There's some obvious hand-waving in the paragraphs above. I would
> > love it if somebody with an NFS system could do some before/after
> > timings with various numbers of loose objects, to get a sense of
> > where the breakeven point is.
> >
> > My gut is that we do not need the complexity of a cache-size limit,
> > nor of a config option to disable this. But it would be nice to have
> > a real number where "reasonable" ends and "pathological" begins. :)
>
> I'm happy to test this on some of the NFS we have locally, and started
> out with a plan to write some for-loop using the low-level API (so it
> would look up all 256), fake-populate .git/objects/?? with N number of
> objects, etc., but ran out of time.
>
> Do you have something ready that you think would be representative and
> that I could just run? If not I'll try to pick this up again...

No, but they don't even really need to be actual objects. So I suspect
something like:

  git init
  for i in $(seq 0 255); do
    i=$(printf %02x $i)
    mkdir -p .git/objects/$i
    for j in $(seq --format=%038g 1000); do
      echo foo >.git/objects/$i/$j
    done
  done
  git index-pack -v --stdin </path/to/git.git/objects/pack/XYZ.pack

might work (for various values of 1000). The shell loop would probably
be faster as perl, too. :)

Make sure you clear the object directory between runs, though
(otherwise the subsequent index-packs really do find collisions and
spend time accessing the objects); a rough timing-loop sketch is
appended at the end of this mail.

If you want real objects, you could probably just dump a bunch of
sequential blobs to fast-import, and then pipe the result to
unpack-objects (an untested sketch of that is appended below, too).

-Peff
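
To expand on the clear-between-runs point, a rough timing driver might
look like the following. It is only a sketch: the "repo" name, the pack
path, the run count, and the per-directory object count are all
placeholders, and it simply re-creates the repository from scratch for
every run so that no previously indexed pack is left behind:

  # placeholders: pack to index, and fake loose objects per fanout dir
  PACK=/path/to/git.git/objects/pack/XYZ.pack
  NR_LOOSE=1000

  for run in 1 2 3; do
    # start from a fresh repo so the previous run's pack (and its real
    # objects) cannot trigger genuine collision checks
    rm -rf repo
    git init -q repo
    for i in $(seq 0 255); do
      i=$(printf %02x $i)
      mkdir -p repo/.git/objects/$i
      for j in $(seq --format=%038g $NR_LOOSE); do
        echo foo >repo/.git/objects/$i/$j
      done
    done
    time git -C repo index-pack -v --stdin <"$PACK"
  done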
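
And for the real-objects route, an untested sketch of the
fast-import/unpack-objects idea. The "scratch.git" and "repo" names and
the blob count/contents are arbitrary; fast-import packs the blobs into
a throwaway bare repo, and unpack-objects then explodes that pack into
loose objects in the test repo:

  # feed N unique blobs to fast-import in a scratch repo...
  git init -q --bare scratch.git
  n=0
  while test $n -lt 100000; do
    blob="this is blob number $n"
    printf 'blob\ndata %d\n%s\n' "${#blob}" "$blob"
    n=$((n+1))
  done | git --git-dir=scratch.git fast-import

  # ...then explode the resulting pack(s) into loose objects
  git init -q repo
  for pack in scratch.git/objects/pack/*.pack; do
    git -C repo unpack-objects <"$pack"
  done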