Re: Bug: git log --numstat counts wrong

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

> Alexander Pepper <pepper@xxxxxxxxxxxxxxxx> writes:
>
>> Am 21.09.2011 um 14:24 schrieb Junio C Hamano:
>>>> $ git log --numstat 48a07e7e533f507228e8d1c99d4d48e175e14260
>>>> [...]
>>>> 11      10      src/java/voldemort/server/storage/StorageService.java
>>> 
>>> Didn't we update it this already? I seem to get 10/9 here not 11/10.
>>
>> Current 'maint' (cd2b8ae9), 'master' (4b5eac7f)...
>
> That's a tad old master you seem to have.
>
> Strangely, bisection points at 27af01d5523, which was supposed to be only
> about performance and never about correctness. There is something fishy
> going on....

In any case, I think the real issue is that depending on how much context
you ask, the resulting diff is different (and both are valid diffs). If
you ask "log -p" (or "diff" or "show") to produce a patch, then we use the
default 3-line context. And then you feed that to an external diffstat to
count the number of deleted and added lines to get one set of numbers.

The --numstat (and --diffstat) code seems to be running the internal diff
machinery with 0-line context and counting the resulting diff internally.

And of course the results between the above two would be different because
diff can match lines differently when given different number of context
lines to include in the result.

So perhaps a good sanity-check for you to try (note: not checking your
sanity, but checking the sanity of the above analysis) would be to do:

  $ git show 48a07e7e53 -- $that_path | diffstat
  $ git show -U0 48a07e7e53 -- $that_path | diffstat
  $ git show --numstat 48a07e7e53 -- $that_path
  $ git show --stat 48a07e7e53 -- $that_path

and see how they compare (make sure to use the same version of git for
these experiments). The first one uses the default 3-lines context, the
second one forces 0-line context, and the last two uses 0-line context
hardwired in the code.

Applying the following patch should make the last two use the default
context or -U$num given from the command line to be consistent with the
codepath where we generate textual patches.

 diff.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/diff.c b/diff.c
index 9038f19..302ef33 100644
--- a/diff.c
+++ b/diff.c
@@ -2251,6 +2251,8 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 		memset(&xpp, 0, sizeof(xpp));
 		memset(&xecfg, 0, sizeof(xecfg));
 		xpp.flags = o->xdl_opts;
+		xecfg.ctxlen = o->context;
+		xecfg.interhunkctxlen = o->interhunkcontext;
 		xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat,
 			      &xpp, &xecfg);
 	}


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]