On Thu, Apr 20, 2017 at 12:40:52PM +0200, Johannes Schindelin wrote:

> > > Teach register_rename_src() to see if new file pair can simply be
> > > appended to the rename_src[] array before performing the binary search
> > > to find the proper insertion point.
> >
> > I guess your perf results show some minor improvement. But I suspect
> > this is because your synthetic repo does not resemble the real world
> > very much.
>
> Please note that the synthetic test repo was added *after* coming up with
> the patch, *after* performance benchmarking on a certain really big
> repository (it is not hard to guess what use case we are optimizing,
> right?).
>
> In that light, I would like to register the fact that Jeff's performance
> work is trying to improve a very real world, that of more than 2,000
> developers in our company [*1*].

Sure; I didn't think it came out of thin air. What are the benchmarks on
this real-world repository, then?

Specifically, it looks like this optimization isn't really about the
number of files in the repository so much as the number of
additions/deletions in a particular diff (which is what becomes the
rename sources and destinations). Is it common to add or delete 4 million
tiny files and then run "git status"?

Note that I think the optimization probably _is_ worth doing in the
general case. These "is it sorted" tradeoffs can backfire if we sometimes
get unsorted input, but I don't think that would ever be the case here.
My main complaint is not that it's not worth doing, but that I'm not
excited about sprinkling these checks ad-hoc throughout the code base.

-Peff
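
P.S. For readers following along, here is a minimal sketch of the kind of
"append before binary search" fast path under discussion. The names
(struct registry, register_entry, etc.) are hypothetical and this is not
the actual diffcore-rename code; it only shows the shape of the check,
with error handling kept trivial.

  #include <stdlib.h>
  #include <string.h>

  struct entry {
          const char *path;
  };

  struct registry {
          struct entry *items;   /* kept sorted by path */
          size_t nr, alloc;
  };

  static void grow(struct registry *r)
  {
          if (r->nr == r->alloc) {
                  r->alloc = r->alloc ? 2 * r->alloc : 16;
                  r->items = realloc(r->items, r->alloc * sizeof(*r->items));
                  if (!r->items)
                          abort(); /* keep the sketch simple */
          }
  }

  static void register_entry(struct registry *r, const char *path)
  {
          size_t first = 0, last = r->nr;

          /*
           * Fast path: if the new path sorts at or after the current
           * last element, appending keeps the array sorted and we can
           * skip the binary search entirely. This pays off when the
           * input arrives (mostly) in sorted order.
           */
          if (r->nr && strcmp(r->items[r->nr - 1].path, path) <= 0) {
                  grow(r);
                  r->items[r->nr++].path = path;
                  return;
          }

          /* Slow path: binary search for the insertion point. */
          while (first < last) {
                  size_t mid = first + (last - first) / 2;
                  if (strcmp(r->items[mid].path, path) < 0)
                          first = mid + 1;
                  else
                          last = mid;
          }

          grow(r);
          memmove(r->items + first + 1, r->items + first,
                  (r->nr - first) * sizeof(*r->items));
          r->items[first].path = path;
          r->nr++;
  }

The win obviously depends on the input being mostly sorted already; with
random input the extra strcmp() is pure overhead, which is the tradeoff
mentioned above.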