Re: Bug: git log --numstat counts wrong

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 23, 2011 at 1:51 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> Alexander Pepper <pepper@xxxxxxxxxxxxxxxx> writes:
>>
>>> Am 21.09.2011 um 14:24 schrieb Junio C Hamano:
>>>>> $ git log --numstat 48a07e7e533f507228e8d1c99d4d48e175e14260
>>>>> [...]
>>>>> 11      10      src/java/voldemort/server/storage/StorageService.java
>>>>
>>>> Didn't we update it this already? I seem to get 10/9 here not 11/10.
>>>
>>> Current 'maint' (cd2b8ae9), 'master' (4b5eac7f)...
>>
>> That's a tad old master you seem to have.
>>
>> Strangely, bisection points at 27af01d5523, which was supposed to be only
>> about performance and never about correctness. There is something fishy
>> going on....
>
> In any case, I think the real issue is that depending on how much context
> you ask, the resulting diff is different (and both are valid diffs). If
> you ask "log -p" (or "diff" or "show") to produce a patch, then we use the
> default 3-line context. And then you feed that to an external diffstat to
> count the number of deleted and added lines to get one set of numbers.
>
> The --numstat (and --diffstat) code seems to be running the internal diff
> machinery with 0-line context and counting the resulting diff internally.
>
> And of course the results between the above two would be different because
> diff can match lines differently when given different number of context
> lines to include in the result.
>
> So perhaps a good sanity-check for you to try (note: not checking your
> sanity, but checking the sanity of the above analysis) would be to do:
>
>  $ git show 48a07e7e53 -- $that_path | diffstat
>  $ git show -U0 48a07e7e53 -- $that_path | diffstat
>  $ git show --numstat 48a07e7e53 -- $that_path
>  $ git show --stat 48a07e7e53 -- $that_path
>
> and see how they compare (make sure to use the same version of git for
> these experiments). The first one uses the default 3-lines context, the
> second one forces 0-line context, and the last two uses 0-line context
> hardwired in the code.
>
> Applying the following patch should make the last two use the default
> context or -U$num given from the command line to be consistent with the
> codepath where we generate textual patches.
>
>  diff.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 9038f19..302ef33 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -2251,6 +2251,8 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
>                memset(&xpp, 0, sizeof(xpp));
>                memset(&xecfg, 0, sizeof(xecfg));
>                xpp.flags = o->xdl_opts;
> +               xecfg.ctxlen = o->context;
> +               xecfg.interhunkctxlen = o->interhunkcontext;
>                xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat,
>                              &xpp, &xecfg);
>        }

Thanks Junio.

But wait, where does this patch go? Before or after 27af01d? If I'm
understanding the situation correctly, this patch won't change the
reporting 10/9 for --numstat, no?

Anyway, this patch looks right.

  Acked-by: Tay Ray Chuan <rctay89@xxxxxxxxx>

Interesting to find that we have many xdiff users that don't respect
diff options on the command line (or may not acess to them), like
patch-id, merge. I wonder if there would be less conflicts if merge's
ctxlen could be overriden...

-- 
Cheers,
Ray Chuan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]