On Fri, Feb 18, 2011 at 03:27:36PM -0800, Linus Torvalds wrote:

> >  1. Did you bump up your merge.renamelimit? It's hard to see because it
> >     scrolls off the screen amidst the many conflicts, but the first
> >     message is:
> >
> >       warning: too many files (created: 425 deleted: 1093), skipping
> >       inexact rename detection
> >
> >     which you want to use. Try "git config merge.renamelimit
> >     10000". Which runs pretty snappily on my machine; I wonder if we
> >     should up the default limit.
>
> Yeah, for the kernel, I have
>
>   [diff]
>           renamelimit=0
>
> to disable the limit entirely, because the default limit is very low
> indeed. Git is quite good at the rename detection.
>
> However, the reason for the low default is not because it's not snappy
> enough - it's because it can end up using a lot of memory (and if
> you're low on memory, the swapping will mean that it goes from "quite
> snappy" to "slow as molasses" - but it still will not be CPU limited,
> it's just paging like crazy).

I think it can be both. There is an O(n^2) part to the algorithm. I did
some timings a few years ago that showed an n^2 increase in time as you
bumped the limit:

  http://article.gmane.org/gmane.comp.version-control.git/73519

That's staying within a reasonable memory size. I would not be surprised
if you can get much worse behavior by going into swap, but I didn't
measure peak memory use there.

Those tests led to:

  commit 50705915eae89eae490dff30fa370ed02e4d6e72
  Author: Jeff King <peff@xxxxxxxx>
  Date:   Wed Apr 30 13:24:43 2008 -0400

      bump rename limit defaults

      The current rename limit default of 100 was arbitrarily
      chosen. Testing[1] has shown that on modern hardware, a limit of
      200 adds about a second of computation time, and a limit of 500
      adds about 5 seconds of computation time.

      This patch bumps the default limit to 200 for viewing diffs, and
      to 500 for performing a merge.
      The limit for generating git-status templates is set
      independently; we bump it up to 200 here, as well, to match the
      diff limit.

But perhaps it's time to revisit the test; it's been 2 years, and my
hardware at the time was probably 2 years out of date. :)

Here are the old and new times for various sizes of rename. Details
about the test are in the message referenced above.

     N    Old CPU Seconds    New CPU Seconds
    10         0.43                0.02
   100         0.44                0.20
   200         1.40                0.55
   400         4.87                1.90
   800        18.08                7.01
  1000        27.82               10.83

So maybe bump the diff limit to 400 and the merge limit to 1000,
doubling both? That leaves us at around 2 seconds per commit for a log,
and 10 seconds tacked onto a merge. We could maybe even go higher with
the merge limit. If it's such a big merge, the conflict resolution is
probably going to take forever anyway, so 30 extra seconds is probably a
good thing if it makes rename detection work.

According to top, git only hit around 17M resident on the 1000-sized
one, so I don't think memory is a problem, at least for average repos
(and yes, I know top is an awful way to measure, but it's quick, and it
would need to be off by orders of magnitude for memory to be a problem).

So I'm in favor of bumping the limits, or possibly even removing the
hard number limit and putting in a "try to do renames for this many
seconds" option. If we're going to have something like 30-second delays
on merge, though, we should perhaps write some eye candy to stderr after
2 seconds or so (like we do with "git checkout").

> So I do think we could try to lift the default a bit, but it might be
> even more important to just make the message much more noticeable and
> avoid scrolling past it. For example, setting a flag, and not printing
> out the message immediately, but instead print it out only if it turns
> into trouble at the end.

Yeah, I also think that would be useful. And if that information filters
up to the merge command, it can even give better advice (like how to
tweak the limit).
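[For anyone following along: the knobs under discussion can be raised in
a config file today. A minimal fragment, assuming you want rename
detection never to bail out (0 disables the limit entirely, as Linus
does for the kernel); the exact values here are illustrative, not a
recommendation:

  [diff]
          renameLimit = 0
  [merge]
          renameLimit = 0

Or per-option from the command line: "git config diff.renameLimit 0".]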
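[A toy sketch of where the n^2 comes from, for readers who want the
shape of the problem. This is NOT git's implementation: git scores
similarity with hashed content chunks, while this sketch substitutes
Python's difflib ratio; the function name, threshold, and inputs are all
made up for illustration. The point is only that every deleted path is
scored against every added path, so the candidate-scoring step is
O(deleted * added):

```python
# Toy model of inexact rename detection (not git's actual algorithm).
# Each deleted blob is compared against each added blob, giving the
# quadratic candidate-scoring step discussed above. difflib's ratio()
# stands in for git's chunk-hash similarity heuristic.
from difflib import SequenceMatcher

def detect_renames(deleted, added, threshold=0.5):
    """Pair each deleted blob with its most similar added blob.

    `deleted` and `added` map path -> file content. Returns a list of
    (old_path, new_path, score) guesses scoring at least `threshold`.
    """
    pairs = []
    for old_path, old_content in deleted.items():      # O(n) outer loop...
        best = None
        for new_path, new_content in added.items():    # ...times O(n) inner
            score = SequenceMatcher(None, old_content, new_content).ratio()
            if best is None or score > best[1]:
                best = (new_path, score)
        if best and best[1] >= threshold:
            pairs.append((old_path, best[0], best[1]))
    return pairs

renames = detect_renames(
    {"src/foo.c": "int main(void) { return 0; }\n"},
    {"lib/foo.c": "int main(void) { return 0; }\n",
     "README":    "totally different text\n"},
)
print(renames[0][:2])   # ('src/foo.c', 'lib/foo.c')
```

A limit like merge.renamelimit caps n exactly because this pairing work
grows with the product of the two file lists.]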
-Peff