Re: make git diff output easier to read - use better diff heuristics

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi guys,

While I din't have the experience to express an opinion on this
matter, I have to say that the --no-indent-heuristic that Jeff
suggested worked great.
There were more than a handful of cases that this issue happened in my
diff file (all were the same: #endif followed by #ifdef).
Oh, and the language is C indeed.

Thanks for the great support!

On Tue, Feb 13, 2018 at 8:11 PM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
> On Tue, Feb 13, 2018 at 8:08 AM, Jeff King <peff@xxxxxxxx> wrote:
>> On Tue, Feb 13, 2018 at 05:06:15PM +0200, Σπύρος Βαζαίος wrote:
>>
>>> Hi, I've came across an issue when using the git diff command. In
>>> particular the diff is different to what the svn diff produces. While
>>> both being correct the output of the svn diff is easier to understand
>>> than the git diff one. See the following issue on github where I
>>> initially reported the issue:
>>>
>>> https://github.com/git-for-windows/git/issues/1494
>>>
>>> I have Included a picture to better illustrate the problem. What do
>>> you think? Is it possible to make git diff output similar to svn diff
>>> regarding this issue?
>>
>> Try "git diff --no-indent-heuristic", which makes your example look
>> better. Here's a quick reproduction:
>>
>> -- >8 --
>> cat >foo.c <<\EOF
>> static struct foo bar[] = {
>> #ifdef SOMETHING
>>         { "stats.info", MNU_GBX_FSTAINF, etc },
>>         { "expired.info", MNU_GBX_FSTAINF, etc },
>>         { "info.log", MNU_GBX_INFOLOG, etc },
>> #endif
>>         { NULL, 0, 0 },
>> };
>> EOF
>>
>> sed '6a\
>> #ifdef WITH_EMU\
>>         { "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\
>> #endif
>> ' <foo.c >bar.c
>> -- 8< --
>>
>> Now this looks ugly:
>>
>>   git diff --no-index foo.c bar.c
>>
>> but this does not:
>>
>>   git diff --no-index --no-indent-heuristic foo.c bar.c
>>
>> That heuristic is described in 433860f3d0 (diff: improve positioning of
>> add/delete blocks in diffs, 2016-09-05). I'm not sure exactly why it
>> does the wrong thing here, or if it would be possible to tweak the
>> weighting factors to make it work.
>
> https://github.com/git/git/commit/433860f3d0
>
> I would think that the long lines at the shift boundaries[1] have
> bad impact on the algorithm as they are of different length
> and influence the backstepping value. (vague theory)
>
> I wonder if we want to add language specifics to the heuristic,
> such that '}' or 'end' in a given line have a strong signal that the
> cut should be right after that line? i.e. that would decrease the
> badness score.
>
> While this is not a general solution (but quite language specific),
> I would think this is still a good idea to look at.
> (When coming up with the heuristic, most people thought the bad
> examples would come from exotic languages, but not C, which we
> are all so familiar with such that we thought we had a good grasp
> how to make a heuristic that works with C. Note that the example
> looks like C, though).
>
> [1]
> { "info.log", MNU_GBX_INFOLOG, etc },
> vs
> { "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\
>
>
> (slightly unrelated tangent on better diffs in general:)
> A couple days ago I had another discussion on how
> diffs are best presented to users. That discussion was
> focused on a slightly higher layer, the diff algorithm itself,
> instead of a post-algorithm boost-fixup.
>
> It is usually easier to both write as well as review code that
> adds completely new features or projects instead of squashing
> bugs or integrating subtle new features in an existing code base.
>
> This observation lead to the conclusion that reviewing large
> locks of adjacent new lines is easier than reviewing blocks
> where deletions as well as additions are found. So we theorized
> that deletions should have more impact than additions when
> computing the diff itself. The case presented here has no deletions
> so this is not applicable, but most examples that came up once
> 2477ab2ea8 (diff: support anchoring line(s), 2017-11-27)
> was discussed would be likely to have better diffs.
>
> Stefan




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux