Re: absurdly slow git-diff

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 7 Nov 2008, Linus Torvalds wrote:

> On Sat, 8 Nov 2008, Abhijit Menon-Sen wrote:
> >
> > If anyone's interested, the files are http://toroid.org/misc/1 and
> > http://toroid.org/misc/2
> 
> Btw, you can see this by just doing
> 
> 	git diff 1 2
> 
> without even doing "git init" or doing any actual git repository.
> 
> > Does anyone understand why this slowdown might happen or have
> > suggestions about where I should look for it?
> 
> Sure. It's actually fairly simple. You're hitting a O(n^2) thing (possibly 
> higher), and it's triggered by the fact that almost all your lines are 
> identical, ie you have a file that basically has 40,000 lines of each of
> 
> 	xxxx: xxx, xx xxx xxxx xx:xx:xx +xxxx
> 	xx: xxxx xxxxxxxxxxx <xxxx@xxxxxxxx>
> 	xxxx: xxxx xxxxxxxxxxx <xxxx@xxxxxxxx>
> 
> and 30,000 of 
> 
> 	* xxxxx xxxxx (xxxxxx.xxxx xxx xxxxx (\Xxxxxx) xxxxxx.xxxxxx {xxx}
> 
> with a smattering of others. And this is a case where the internal git 
> implementation does really badly. And nobody has really cared before, 
> because nobody has ever had a case that mattered.
> 
> There's a number of different 'diff' algorithms, and it looks like GNU 
> diff has one that avoids the O(n^2) case for this case.
> 
> I'm adding Davide as the original author of the diff library to the cc. 
> I'm also adding Pierre, since he was talking about trying to implement
> another diff algorithm (although I'm not at all sure that the patience 
> diff really would help this case at all).

That should be an easy fix. Just need to limit the window by which 
xdl_clean_mmatch() scans the current position.



- Davide


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux