On Fri, Sep 23, 2011 at 1:51 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Junio C Hamano <gitster@xxxxxxxxx> writes: > >> Alexander Pepper <pepper@xxxxxxxxxxxxxxxx> writes: >> >>> Am 21.09.2011 um 14:24 schrieb Junio C Hamano: >>>>> $ git log --numstat 48a07e7e533f507228e8d1c99d4d48e175e14260 >>>>> [...] >>>>> 11 10 src/java/voldemort/server/storage/StorageService.java >>>> >>>> Didn't we update it this already? I seem to get 10/9 here not 11/10. >>> >>> Current 'maint' (cd2b8ae9), 'master' (4b5eac7f)... >> >> That's a tad old master you seem to have. >> >> Strangely, bisection points at 27af01d5523, which was supposed to be only >> about performance and never about correctness. There is something fishy >> going on.... > > In any case, I think the real issue is that depending on how much context > you ask, the resulting diff is different (and both are valid diffs). If > you ask "log -p" (or "diff" or "show") to produce a patch, then we use the > default 3-line context. And then you feed that to an external diffstat to > count the number of deleted and added lines to get one set of numbers. > > The --numstat (and --diffstat) code seems to be running the internal diff > machinery with 0-line context and counting the resulting diff internally. > > And of course the results between the above two would be different because > diff can match lines differently when given different number of context > lines to include in the result. > > So perhaps a good sanity-check for you to try (note: not checking your > sanity, but checking the sanity of the above analysis) would be to do: > > $ git show 48a07e7e53 -- $that_path | diffstat > $ git show -U0 48a07e7e53 -- $that_path | diffstat > $ git show --numstat 48a07e7e53 -- $that_path > $ git show --stat 48a07e7e53 -- $that_path > > and see how they compare (make sure to use the same version of git for > these experiments). The first one uses the default 3-lines context, the > second one forces 0-line context, and the last two uses 0-line context > hardwired in the code. > > Applying the following patch should make the last two use the default > context or -U$num given from the command line to be consistent with the > codepath where we generate textual patches. > > diff.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/diff.c b/diff.c > index 9038f19..302ef33 100644 > --- a/diff.c > +++ b/diff.c > @@ -2251,6 +2251,8 @@ static void builtin_diffstat(const char *name_a, const char *name_b, > memset(&xpp, 0, sizeof(xpp)); > memset(&xecfg, 0, sizeof(xecfg)); > xpp.flags = o->xdl_opts; > + xecfg.ctxlen = o->context; > + xecfg.interhunkctxlen = o->interhunkcontext; > xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat, > &xpp, &xecfg); > } Thanks Junio. But wait, where does this patch go? Before or after 27af01d? If I'm understanding the situation correctly, this patch won't change the reporting 10/9 for --numstat, no? Anyway, this patch looks right. Acked-by: Tay Ray Chuan <rctay89@xxxxxxxxx> Interesting to find that we have many xdiff users that don't respect diff options on the command line (or may not acess to them), like patch-id, merge. I wonder if there would be less conflicts if merge's ctxlen could be overriden... -- Cheers, Ray Chuan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html