Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes

Davide Libenzi <davidel@xxxxxxxxxxxxxxx> · Mon, 16 Oct 2006 11:18:22 -0700 (PDT)

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> On Mon, 16 Oct 2006, Jim Meyering wrote:
> > 
> > That helps a little.
> > Now, instead of taking 63s, my test takes ~30s.
> > (32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)
> 
> Btw, what architecture is this on?
> 
> I'm testing those two files, and I get much more reasonable numbers with 
> both ppc32 and x86. Both 32-bit:
> 
> 	[torvalds@macmini test-perf]$ time git show | wc -l
> 	25221
> 
> 	real    0m1.437s
> 	user    0m1.436s
> 	sys     0m0.012s
> 
> ie it generated the diff in less than a second and a half. Not wonderful, 
> but certainly not your 63s either.
> 
> HOWEVER. On x86-64, it takes forever (still not 63 seconds, but it takes 
> 17 seconds on my 2GHz merom machine).
> 
> So I think there's something seriously broken with hashing on 64-bit. 
> 
> And I think I know what it is.
> 
> Try this patch. And make sure to do a "make clean" first, since I think 
> the dependencies on xdiff may be broken.
> 
> Davide: there's two things wrong with your old XDL_HASHLONG():
> 
>  - the GR_PRIME was just 32-bit, so it wouldn't shift low bits up far 
>    enough on a 64-bit architecture, so then shifting things down caused 
>    pretty much everything to be very small.
> 
>  - The whole idea of shifting up by multiplying and then shifting down to 
>    get the high bits is _broken_. Even on 32-bit architectures. Think 
>    about what happens when "hashbits" is 16 on a 32-bit architecture: the 
>    multiply moves the low bits _up_, but it doesn't move the high bits 
>    _down_. And with hashbits being a large fraction of the whole word, you 
>    need to shift things down, not up.
> 
> So just making GR_PRIME be a bigger value on a 64-bit architecture would 
> not have fixed it. The whole hash was simply broken. Do it the sane and 
> obvious way instead: always pick the low bits, but mix in upper bits there 
> too..

Yeah, using an appropriate golden ratio prime for 64 bits fixes it. I 
think it's the best/minimal fix (use 0x9e37fffffffc0001UL, like the 
kernel does).
I'm also looking into optimizing the multi-match discard loop, that 
actually loses the classifier informations collected in the context 
prepare phase.

- Davide

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html