On 02/18/2010 06:36 AM, Junio C Hamano wrote:
Nicolas Pitre <nico@xxxxxxxxxxx> writes:
It is likely to perform better if the buffer is small enough to
fit in the CPU's L1 cache. There are two sequential passes over the
buffer: one for the SHA1 computation and another for the compression,
and currently they're sure to thrash the L1 cache on each pass.
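To make the point concrete, here is a minimal sketch (not git's actual code) of interleaving the two passes: feed the same small chunk to the SHA1 context and to the deflate stream before moving on, so the second pass touches data that is still hot in the L1 cache. The 4 kB chunk size is an assumption taken from the numbers later in this thread.

```python
import hashlib
import zlib

CHUNK = 4 * 1024  # hypothetical cache-friendly window size


def hash_and_deflate(data: bytes) -> tuple[str, bytes]:
    """Interleave SHA-1 and deflate over small chunks of `data`."""
    sha = hashlib.sha1()
    comp = zlib.compressobj()
    out = []
    for off in range(0, len(data), CHUNK):
        chunk = data[off:off + CHUNK]
        sha.update(chunk)                 # pass 1 over this chunk
        out.append(comp.compress(chunk))  # pass 2, chunk still in cache
    out.append(comp.flush())
    return sha.hexdigest(), b"".join(out)
```

The results are byte-identical to hashing and compressing the whole buffer in two separate passes; only the memory access pattern changes.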
I did a very unscientific test hashing about 14k paths (arch/ and fs/ from
the kernel source) using "git-hash-object -w --stdin-paths" into an empty
repository with varying sizes of paranoia buffer (a quarter, 1, 4, 8 and
256kB) and saw 8-30% overhead. 256kB did hurt, and around 4kB seemed to be
optimal for this small sample load.
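An equally unscientific micro-benchmark in the same spirit can be sketched like this: time the chunked SHA-1 + deflate loop at several buffer sizes and compare. The sizes mirror the ones tried above; absolute timings will of course vary by machine, so no particular output is claimed.

```python
import hashlib
import os
import time
import zlib

DATA = os.urandom(1 << 20)  # 1 MB of incompressible sample data


def run(chunk_size: int) -> float:
    """Time one SHA-1 + deflate pass over DATA at the given chunk size."""
    sha = hashlib.sha1()
    comp = zlib.compressobj()
    start = time.perf_counter()
    for off in range(0, len(DATA), chunk_size):
        piece = DATA[off:off + chunk_size]
        sha.update(piece)
        comp.compress(piece)
    comp.flush()
    return time.perf_counter() - start


if __name__ == "__main__":
    for size in (256, 1024, 4096, 8192, 256 * 1024):
        print(f"{size:>7} B: {run(size):.4f} s")
```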
In any case, with any size of paranoia buffer, this hurts the sane use case.
That's because by mmapping + memcpying you get the worst of both worlds:
you take a page fault per page, as with mmap, and you touch the memory
twice, as with read.
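The contrast can be illustrated with a small sketch of the two strategies, here in Python rather than git's C for brevity: read() copies the file into our buffer in a single pass, while mmap + an explicit copy first faults each page in and then touches it again for the memcpy.

```python
import mmap
import os
import tempfile

payload = b"x" * (1 << 16)  # 64 kB sample file

fd, path = tempfile.mkstemp()
os.write(fd, payload)
os.close(fd)

# Strategy 1: read() -- the kernel fills our buffer in one pass.
with open(path, "rb") as f:
    via_read = f.read()

# Strategy 2: mmap + copy -- a page fault per page on first touch,
# then a second pass over the same memory for the copy.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        via_mmap = bytes(m)  # the extra "memcpy" pass

os.unlink(path)
assert via_read == via_mmap == payload
```

Both yield the same bytes; the mmap + copy variant simply pays for the page faults *and* the extra traversal, which is the "worst of both worlds" point above.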
Paolo