On 3/8/2021 7:09 PM, Elijah Newren via GitGitGadget wrote:> The basic idea here is: > > We only need expensive rename detection on the subset of files changed on > both sides of history (for the most part). > > This is because: > > 1. The primary reason for rename detection in merges is enabling three-way > content merges > 2. The purpose of three-way content merges is reconciling changes when > > both sides of history modified some file 3. If a file was only modified by > the side that renamed the file, then detecting the rename is irrelevant; > we'll get the same answer without knowing about the rename. 4. (Well...there > are rare cases where we need the rename for reasons other than three-way > content merges. Patch 5 explains those.) Makes sense. Don't compute information you won't need. I look forward to trying to figure out the special cases here. > Before Series After Series > no-renames: 12.596 s ± 0.061 s 5.680 s ± 0.096 s > mega-renames: 130.465 s ± 0.259 s 13.812 s ± 0.162 s > just-one-mega: 3.958 s ± 0.010 s 506.0 ms ± 3.9 ms These are _very_ impressive numbers for such a "simple" idea. > However, interestingly, if we had ignored the basename-guided rename > detection optimizations[2][3], then this optimization series would have > improved the performance as follows: > > Before Basename Series After Just This Series > no-renames: 13.815 s ± 0.062 s 5.728 s ± 0.104 s > mega-renames: 1799.937 s ± 0.493 s 18.213 s ± 0.139 s > just-one-mega 51.289 s ± 0.019 s 891.9 ms ± 7.0 ms And here it is even more impressive. I see that your optimizations are having combined effects but also are doing valuable things on their own. > We get best results by prioritizing them as follows: > > 1. exact rename detection > 2. skip-because-unnecessary > 3. basename-guided rename detection This makes sense to me, since even the basename-guided rename is doing some non-trivial work. It would be good to reduce that effort. > it means that our remaining optimization potential is > somewhat limited, and subsequent optimization series will have to fight for > much smaller gains. This is a good place to end up. Let the code rest for a bit after we are done here, and maybe we'll find new cases to care about later. We could chase the long tail forever, but these steps are a huge accomplishment! Getting to reading now. -Stolee