On Wed, Nov 07, 2018 at 10:55:24PM +0000, Geert Jansen wrote:

> On Mon, Oct 29, 2018 at 07:27:39PM -0400, Jeff King wrote:
>
> > On Mon, Oct 29, 2018 at 08:36:07PM +0100, Ævar Arnfjörð Bjarmason wrote:
> >
> > >  * Re-roll my 4 patch series to include the patch you have in
> > >    <20181027093300.GA23974@xxxxxxxxxxxxxxxxxxxxx>
> >
> > I don't think it's quite ready for inclusion as-is. I hope to brush it
> > up a bit, but I have quite a backlog of stuff to review, as well.
>
> We're still quite keen to get this patch included. Is there anything I
> can do to help?

Yes, testing and review. :)

I won't send the series out just yet, as I suspect it could use another
read-through on my part. But if you want to peek at it or try some
timings, it's available at:

  https://github.com/peff/git jk/loose-cache

It's quite a bit bigger than the original patch, as some refactoring was
necessary to reuse the existing cache in alternate_object_directories.
I'm rather pleased with how it turned out; it unifies the handling of
alternates and the main object directory, which is a cleanup I've been
wanting to do for some time.

> Also I just re-read your comments on maximum cache size. I think you
> were arguing both sides of the equation and I wasn't sure where you'd
> ended up. :)
>
> A larger cache size potentially takes more time to fill up, especially
> on NFS, while a smaller cache size would obviously be less effective.
> That said, a small cache is still effective for the "clone" case where
> the repo is empty.

I ended up thinking that a large cache is going to be fine. So I didn't
even bother implementing a limit in my series, which makes things a bit
simpler (it's one less state to deal with).

Since it reuses the existing cache code, it's better in a few ways than
my earlier patch:

  1. If a program uses OBJECT_INFO_QUICK and prints abbreviated sha1s,
     we only have to load the cache once (I think fetch does this, but
     I didn't test it).

  2. The cache is filled one directory at a time, which avoids
     unnecessary work when there are only a few lookups.

  3. The cache is per-object-directory. So if a request can be filled
     without looking at an alternate, we avoid looking at the alternate.
     I doubt this matters much in practice (the case we care about is
     when we _don't_ have the object, and there you have to look
     everywhere).

The one thing I didn't implement is a config option to disable this.
That would be pretty easy to add. I don't think it's necessary, but it
would make testing before/after behavior easier if somebody thinks it's
slowing down their particular case.

> It also occurred to me that as a performance optimization your patch
> could read the loose object directories in parallel using a thread
> pool. At least on Amazon EFS this should result in an almost linear
> performance increase. I'm not sure how much this would help for local
> file systems. In any case this may be best done as a follow-up patch
> (that I'd be happy to volunteer for).

Yeah, I suspect it could make things faster in some cases. But it also
implies filling all of the cache directories at once up front. The code
I have now tries to avoid unnecessary cache fills, but it would be
pretty easy to kick off a full fill.

I agree it would make more sense as a follow-up patch (and probably
controlled by a config option, since it likely only makes sense when
you have a really high-latency readdir).

-Peff
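
As a purely illustrative sketch of point 2 above (filling the cache one
fan-out directory at a time), here is a small standalone C program. It
is not the jk/loose-cache code; the names (has_loose_object,
load_subdir), the fixed 40-character SHA-1 assumption, and the minimal
error handling are all inventions of the sketch:

  /*
   * Sketch only: the first lookup that touches objects/xx/ does a single
   * readdir() and remembers every entry, so later lookups in that
   * directory are answered from memory instead of hitting the
   * filesystem (or NFS) again.
   */
  #include <dirent.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  #define FANOUT 256 /* objects/00 .. objects/ff */

  struct subdir_cache {
          int loaded;     /* has this xx/ directory been scanned yet? */
          int nr, alloc;
          char **names;   /* remaining 38 hex chars of each entry */
  };

  static struct subdir_cache cache[FANOUT];

  static void load_subdir(const char *objdir, unsigned int fanout)
  {
          char path[4096];
          struct subdir_cache *c = &cache[fanout];
          DIR *dir;
          struct dirent *de;

          c->loaded = 1;
          snprintf(path, sizeof(path), "%s/%02x", objdir, fanout);
          dir = opendir(path);
          if (!dir)
                  return; /* missing or empty fan-out directory */
          while ((de = readdir(dir))) {
                  if (de->d_name[0] == '.')
                          continue;
                  if (c->nr == c->alloc) {
                          c->alloc = c->alloc ? 2 * c->alloc : 16;
                          c->names = realloc(c->names,
                                             c->alloc * sizeof(*c->names));
                  }
                  c->names[c->nr++] = strdup(de->d_name);
          }
          closedir(dir);
  }

  /* Return 1 if a loose object with the given 40-char hex oid exists. */
  static int has_loose_object(const char *objdir, const char *hex)
  {
          unsigned int fanout;
          struct subdir_cache *c;
          int i;

          if (strlen(hex) != 40 || sscanf(hex, "%2x", &fanout) != 1)
                  return 0;
          c = &cache[fanout];
          if (!c->loaded)
                  load_subdir(objdir, fanout); /* one readdir(), cached */
          for (i = 0; i < c->nr; i++)
                  if (!strcmp(c->names[i], hex + 2))
                          return 1;
          return 0;
  }

  int main(int argc, char **argv)
  {
          if (argc != 3) {
                  fprintf(stderr, "usage: %s <objdir> <40-hex-oid>\n",
                          argv[0]);
                  return 2;
          }
          printf("%s\n", has_loose_object(argv[1], argv[2]) ?
                 "found" : "missing");
          return 0;
  }

If you compile this and look up two objects that share the same leading
two hex digits, only the first lookup does a readdir(); the second is
served entirely from the in-memory list, which is the behavior that
matters on a high-latency filesystem.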