On Mon, Jul 11, 2011 at 23:10, Tay Ray Chuan <rctay89@xxxxxxxxx> wrote:
> (Shawn, I was held up with the patch messages, sorry for the delay.)
>
> Port JGit's HistogramDiff(Index) over to C. This algorithm extends the
> patience algorithm to "support low-occurrence common elements" [1].
>
> Rough numbers show that it is a faster alternative to its --patience
> cousin, as well as to the default Myers algorithm:
>
>   $ time ./git log --histogram -p v1.0.0 >/dev/null
>
>   real    0m12.998s
>   user    0m11.506s
>   sys     0m1.487s
>   $ time ./git log -p v1.0.0 >/dev/null
>
>   real    0m13.575s
>   user    0m12.101s
>   sys     0m1.468s
>   $ time ./git log --patience -p v1.0.0 >/dev/null
>
>   real    0m14.978s
>   user    0m13.508s
>   sys     0m1.464s

Nice! Not the big difference that it is for us in JGit (between
histogram and Myers), but it's nice to see an improvement here, even if
it is only 0.5s for the entire 1.0.0 history.

How do the diffs come out? One of the arguments for patience diff is
that the formatting can sometimes be more readable for certain changes,
but it's slower. Histogram tries to apply an algorithm similar to
patience in order to get the formatting benefits, while also gaining
some performance. Have you looked at a patch whose output differs
between Myers and patience, and then compared both to the histogram
version?

> The first patch implements JGit's HistogramDiff(Index) proper. The
> second and third patches aren't essential but yield performance gains.
...
> [RFC/PATCH 1/3] teach --histogram to diff
> [RFC/PATCH 2/3] xdiff/xprepare: skip classification
> [RFC/PATCH 3/3] xdiff/xprepare: use a smaller sample size for histogram

Do we need sampling at all for histogram? Can you skip it?

-- 
Shawn.
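
For anyone following the thread who has not looked at either algorithm, here is a rough Python sketch of the difference being discussed. It is not the JGit code and not the code in this patch series. It only illustrates the idea of anchoring on "low-occurrence common elements": patience can only anchor on lines that are unique in both files, while a histogram-style pass counts occurrences and settles for the rarest common line. The names `unique_common_lines`, `pick_anchor` and the `max_occurrences` cutoff are made up for the illustration.

```python
from collections import Counter


def unique_common_lines(a, b):
    """Patience-style candidates: lines occurring exactly once in a and in b."""
    count_a, count_b = Counter(a), Counter(b)
    return [line for line in a if count_a[line] == 1 and count_b[line] == 1]


def pick_anchor(a, b, max_occurrences=64):
    """Histogram-style anchor choice (simplified illustration).

    Returns (index_in_a, index_in_b) for the common line that occurs least
    often in a, or None if the two sequences share no usable line.
    max_occurrences is a hypothetical cutoff for overly common lines.
    """
    counts = Counter(a)  # the "histogram" of line occurrences in a
    first_in_a = {}
    for i, line in enumerate(a):
        first_in_a.setdefault(line, i)  # remember the first position only

    best = None  # (occurrences_in_a, index_in_a, index_in_b)
    for j, line in enumerate(b):
        n = counts.get(line, 0)
        if n == 0 or n > max_occurrences:
            continue  # not common, or too frequent to be a useful anchor
        if best is None or n < best[0]:
            best = (n, first_in_a[line], j)

    return None if best is None else (best[1], best[2])


a = ["x", "x", "y", "y", "y"]
b = ["y", "x", "x"]
print(unique_common_lines(a, b))  # no line is unique in both inputs
print(pick_anchor(a, b))          # the rarest common line is still usable
```

With these inputs the patience-style candidate list is empty, whereas the histogram-style pass still finds an anchor on "x", which occurs less often in `a` than "y" does. A real implementation would go on to extend the anchor into a matching region and recurse on the parts before and after it. That part is left out here.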