Re: [PATCH v4 08/10] commit: add short-circuit to paint_down_to_common()

Derrick Stolee <stolee@xxxxxxxxx> · Wed, 2 May 2018 09:42:42 -0400

On 5/2/2018 9:05 AM, Jakub Narebski wrote:
Derrick Stolee <stolee@xxxxxxxxx> writes:
     For a copy of the Linux repository, we measured the following
     performance improvements:

     git merge-base v3.3 v4.5

     Before: 234 ms
      After: 208 ms
      Rel %: -11%

     git merge-base v4.3 v4.5

     Before: 102 ms
      After:  83 ms
      Rel %: -19%

     The experiments above were chosen to demonstrate that we are
     improving the filtering of the merge-base set. In the first
     example, more time is spent walking the history to find the
     set of merge bases before the remove_redundant() call. The
     starting commits are closer together in the second example,
     therefore more time is spent in remove_redundant(). The relative
     change in performance differs as expected.
Nice.

I was not expecting as much performance improvements as we got for
--contains tests because remove_redundant() is a final step in longer
process, dominated by man calculations.  Still, nothing to sneeze about.

One reason these numbers are not too surprising is that 
remove_redundant() can demonstrate quadratic behavior. It is calculating 
pair-wise reachability by starting a walk at each of the candidates (in 
the worst case). In typical cases, the first walk marks many of the 
other candidates as redundant and we don't need to start walks from 
those commits.

A possible optimization could be to sort the candidates by descending 
generation so we find the first walk is likely to mark the rest as 
redundant. But this may already be the case if the candidates are added 
to the list in order of "discovery" which is already simulating this 
behavior.

Thanks,
-Stolee