On 5/2/2018 9:05 AM, Jakub Narebski wrote:
Derrick Stolee <stolee@xxxxxxxxx> writes:
For a copy of the Linux repository, we measured the following
performance improvements:
git merge-base v3.3 v4.5
Before: 234 ms
After: 208 ms
Rel %: -11%
git merge-base v4.3 v4.5
Before: 102 ms
After: 83 ms
Rel %: -19%
The experiments above were chosen to demonstrate that we are
improving the filtering of the merge-base set. In the first
example, more time is spent walking the history to find the
set of merge bases before the remove_redundant() call. The
starting commits are closer together in the second example,
therefore more time is spent in remove_redundant(). The relative
change in performance differs as expected.
Nice.
I was not expecting as much performance improvements as we got for
--contains tests because remove_redundant() is a final step in longer
process, dominated by man calculations. Still, nothing to sneeze about.
One reason these numbers are not too surprising is that
remove_redundant() can demonstrate quadratic behavior. It is calculating
pair-wise reachability by starting a walk at each of the candidates (in
the worst case). In typical cases, the first walk marks many of the
other candidates as redundant and we don't need to start walks from
those commits.
A possible optimization could be to sort the candidates by descending
generation so we find the first walk is likely to mark the rest as
redundant. But this may already be the case if the candidates are added
to the list in order of "discovery" which is already simulating this
behavior.
Thanks,
-Stolee