Re: [PATCH v1 1/4] fastindex: speed up index load through parallelization

Jeff King <peff@xxxxxxxx> · Mon, 20 Nov 2017 09:20:35 -0500

On Mon, Nov 20, 2017 at 09:01:45AM -0500, Ben Peart wrote:

> Further testing has revealed that switching from the regular heap to a
> refactored version of the mem_pool in fast-import.c produces similar gains
> as parallelizing do_index_load().  This appears to be a much simpler patch
> for similar gains so we will be pursuing that path.

That sounds like a pretty easy win for index entries, which tend to
stick around in big clumps.

Out of curiosity, have you tried experimenting with any high-performance
3rd-party allocator libraries? I've often wondered if we could get a
performance improvement from dropping in a new allocator, but was never
able to measure any real benefit over glibc's ptmalloc2. The situation
might be different on Windows, though (i.e., if the libc allocator isn't
that great).

Most of the high-performance allocators are focused on concurrency,
which usually isn't a big deal for git. But tcmalloc, at least, claims
to be about 6x faster than glibc's malloc.

The reason I ask is that we could possibly get the same wins without
writing a single line of code. And it could apply across the whole
code-base, not just the index code. I don't know how close a general
purpose allocator could come to a pooled implementation, though. You're
inherently making a tradeoff with a pool in not being able to free
individual entries.
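
To make that tradeoff concrete, here is a rough sketch of what I mean by
a pooled allocator. The names (struct pool, pool_alloc, pool_discard)
are made up for illustration; this is not the mem_pool code from
fast-import.c, just the general shape of the technique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch; not the mem_pool from fast-import.c. */
struct pool_block {
	struct pool_block *next;
	size_t used, size;
	_Alignas(max_align_t) char data[]; /* payload follows the header */
};

struct pool {
	struct pool_block *head;
	size_t block_size; /* default payload size of each block */
};

/* Carve len bytes out of the current block, growing the pool if needed. */
static void *pool_alloc(struct pool *p, size_t len)
{
	const size_t align = _Alignof(max_align_t);
	struct pool_block *b = p->head;
	void *ret;

	assert(p->block_size > 0);

	/* round up so the next allocation stays suitably aligned */
	len = (len + align - 1) / align * align;

	if (!b || b->size - b->used < len) {
		/* oversized requests get a block of their own size */
		size_t size = len > p->block_size ? len : p->block_size;

		b = malloc(sizeof(*b) + size);
		if (!b)
			return NULL;
		b->next = p->head;
		b->used = 0;
		b->size = size;
		p->head = b;
	}

	ret = b->data + b->used;
	b->used += len;
	return ret;
}

/* There is no per-entry free; the only way to release memory is all at once. */
static void pool_discard(struct pool *p)
{
	while (p->head) {
		struct pool_block *next = p->head->next;

		free(p->head);
		p->head = next;
	}
}
```

Each allocation is just a bounds check and a pointer bump, with no
per-entry bookkeeping, which is presumably where the speedup comes from.
The cost is that memory only comes back when the whole pool is
discarded.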

-Peff