On Tue, Oct 02, 2007 at 07:28:19PM -0700, Linus Torvalds wrote: > Sadly, that's not the case. It *does* seem to beat the current > implementation, but it's not "beat the pants off". It looks like an > improvement of about 15%, which is nothing to sneeze at, but it's not an > order-of-magnitude improvement either. > > Here's a test-patch. I don't guarantee anything, except that when I did > the timings I also did a "wc" on the result, and they matched.. I get slightly better speedups with my pathological case (around 30%): Before: $ /usr/bin/time git-diff --raw -M -l0 06d288^ 06d288 >/dev/null 105.38user 3.65system 2:14.90elapsed 80%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (15432major+542627minor)pagefaults 0swaps After: $ /usr/bin/time git-diff --raw -M -l0 06d288^ 06d288 >/dev/null 71.70user 3.47system 1:40.43elapsed 74%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (15065major+551778minor)pagefaults 0swaps But yes, it's not the order of magnitude we were looking for. > [torvalds@woody linux]$ time git diff -l0 --stat -C v2.6.22.. | wc I found less noise in the timing by using --raw, since the patch computation takes an appreciable amount of time. > but the diff is fairly simple, so if somebody will go over it and say > whether it's likely to be *correct* too, that 15% may well be worth it. Patch looks correct, and it produces correct results on my (admittedly limited) test data. I think it's worth applying (though I agree that a comment on the assumption of a zero "cnt" at the end is worth adding) unless some drastically different solution comes along (e.g., David's idea to try avoiding the outer O(n^2) loop). But I don't think there is much more to be gained from a different approach to comparing the two hash tables. -Peff - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html