Re: [RFC PATCH] index-pack: improve performance on NFS

Jeff King <peff@xxxxxxxx> · Mon, 29 Oct 2018 17:50:50 -0400

On Mon, Oct 29, 2018 at 09:34:53PM +0000, Geert Jansen wrote:

> On Mon, Oct 29, 2018 at 09:48:02AM +0900, Junio C Hamano wrote:
> 
> > A real question is how much performance gain, relative to ".cloning"
> > thing, this approach gives us.  If it gives us 80% or more of the
> > gain compared to doing no checking, I'd say we have a clear winner.
> 
> I've tested Jeff's loose-object-cache patch and the performance is within error
> bounds of my .cloning patch. A git clone of the same repo as in my initial
> tests:
> 
>   .cloning -> 10m04
>   loose-object-cache -> 9m59
> 
> Jeff's patch does a little more work (256 readdir() calls, which in case of an
> empty repo translate into 256 LOOKUP calls that return NFS4ERR_NOENT) but that
> appears to be insignificant.

Yep, that makes sense. Thanks for timing it.

> I believe the loose-object-cache approach would have a performance regression
> when you're receiving a small pack file and there's many loose objects in the
> repo. Basically you're trading off
> 
>     MIN(256, num_objects_in_pack / dentries_per_readdir) * readdir_latency
>     
> against
> 
>     num_loose_objects * stat_latency

Should num_loose_objects and num_objects_in_pack be swapped here? Just
making sure I understand what you're saying.

The patch I showed just blindly reads each of the 256 object
subdirectories. I think if we pursue this (and it seems like everybody
is on board), we should cache each of those individually. So a single
object would incur at most one opendir/readdir (and subsequent objects
may, too, or they may hit that cache if they share the first byte).
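
To make the per-subdirectory idea concrete, here is a minimal sketch in
Python rather than git's C. It is not the patch itself: the class name
LooseObjectCache, the has_loose() method and the readdir_calls counter
are all hypothetical, and it assumes the usual objects/xx/<38 hex chars>
layout. Each subdirectory is listed only the first time an object with
that first byte is looked up.

```python
import os


class LooseObjectCache:
    """Hypothetical lazy cache: one directory listing per first-byte subdirectory."""

    def __init__(self, objects_dir):
        self.objects_dir = objects_dir
        self.subdirs = {}        # "ab" -> set of file names found in objects/ab
        self.readdir_calls = 0   # illustration only: counts directory listings

    def _load(self, prefix):
        # List one subdirectory; a missing directory simply means "no loose objects".
        self.readdir_calls += 1
        try:
            names = set(os.listdir(os.path.join(self.objects_dir, prefix)))
        except FileNotFoundError:
            names = set()
        self.subdirs[prefix] = names
        return names

    def has_loose(self, hex_oid):
        # Split the hex object id into the subdirectory name and the file name.
        prefix, rest = hex_oid[:2], hex_oid[2:]
        names = self.subdirs.get(prefix)
        if names is None:
            names = self._load(prefix)
        return rest in names
```

With this shape, looking up two objects that share a first byte costs
one directory listing, and a subdirectory that is never asked about is
never read.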

So the 256 in your MIN() is potentially much smaller. We still have to
deal with the fact that if you have a large number of loose objects,
they may be split across multiple readdir (or getdents) calls. The "cache
maximum" we discussed does bound that, but in some ways that's worse:
you waste time doing the bounded amount of readdir and then don't even
get the benefit of the cache. ;)
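
As a rough way to play with the trade-off, here is a back-of-the-envelope
cost model in Python. It rests on assumptions the thread has not
confirmed: it takes the "swapped" reading of the formula (the stat side
scales with the objects in the pack, the readdir side with the loose
objects), it assumes loose objects are spread evenly over the 256
subdirectories, and it charges at least one readdir per touched
subdirectory. The function and parameter names are made up for
illustration.

```python
import math


def readdir_cost(objects_in_pack, loose_objects, dentries_per_readdir, readdir_latency):
    """Estimated cost of filling the per-subdirectory caches (assumed model)."""
    # At most 256 subdirectories, and no more than one per incoming object.
    touched = min(256, objects_in_pack)
    if touched == 0:
        return 0.0
    # Assumption: loose objects are spread uniformly over the 256 subdirectories.
    per_dir = loose_objects / 256
    # Even an empty or missing subdirectory costs one call to find that out.
    calls_per_dir = max(1, math.ceil(per_dir / dentries_per_readdir))
    return touched * calls_per_dir * readdir_latency


def stat_cost(objects_in_pack, stat_latency):
    """Estimated cost of checking each incoming object with its own stat()."""
    return objects_in_pack * stat_latency
```

Comparing readdir_cost() against stat_cost() for a small pack and a large
number of loose objects shows the regression case being discussed; the
latencies are inputs you would have to measure on your own filesystem.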

> On Amazon EFS (and I expect on other NFS server implementations too) it is more
> efficient to do readdir() on a large directory than to stat() each of the
> individual files in the same directory. I don't have exact numbers but based on
> a very rough calculation the difference is roughly 10x for large directories
> under normal circumstances.

I'd expect readdir() to be much faster than stat() in general (e.g., "ls
-f" is faster than "ls -l" even on a warm cache; there's more formatting
going on in the latter, but I think a lot of the difference is the
effort to stat).
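
If anyone wants to measure this on their own filesystem, something like
the following Python sketch would do. The function name is hypothetical,
and it only compares one directory listing against an lstat() of every
entry in that directory. Note that the listing runs first, so the stat
pass may benefit from whatever the listing pulled into the cache.

```python
import os
import time


def time_readdir_vs_stat(path):
    """Return (entry_count, listing_seconds, stat_seconds) for one directory."""
    # One pass over the directory; os.listdir() does not stat the entries.
    start = time.perf_counter()
    names = os.listdir(path)
    readdir_secs = time.perf_counter() - start

    # One lstat() per entry, roughly what a per-object existence check costs.
    start = time.perf_counter()
    for name in names:
        os.lstat(os.path.join(path, name))
    stat_secs = time.perf_counter() - start

    return len(names), readdir_secs, stat_secs
```

Pointing it at a large objects/xx directory on an NFS mount should give
a feel for the ratio.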

-Peff