Hi guys, While I din't have the experience to express an opinion on this matter, I have to say that the --no-indent-heuristic that Jeff suggested worked great. There were more than a handful of cases that this issue happened in my diff file (all were the same: #endif followed by #ifdef). Oh, and the language is C indeed. Thanks for the great support! On Tue, Feb 13, 2018 at 8:11 PM, Stefan Beller <sbeller@xxxxxxxxxx> wrote: > On Tue, Feb 13, 2018 at 8:08 AM, Jeff King <peff@xxxxxxxx> wrote: >> On Tue, Feb 13, 2018 at 05:06:15PM +0200, Σπύρος Βαζαίος wrote: >> >>> Hi, I've came across an issue when using the git diff command. In >>> particular the diff is different to what the svn diff produces. While >>> both being correct the output of the svn diff is easier to understand >>> than the git diff one. See the following issue on github where I >>> initially reported the issue: >>> >>> https://github.com/git-for-windows/git/issues/1494 >>> >>> I have Included a picture to better illustrate the problem. What do >>> you think? Is it possible to make git diff output similar to svn diff >>> regarding this issue? >> >> Try "git diff --no-indent-heuristic", which makes your example look >> better. Here's a quick reproduction: >> >> -- >8 -- >> cat >foo.c <<\EOF >> static struct foo bar[] = { >> #ifdef SOMETHING >> { "stats.info", MNU_GBX_FSTAINF, etc }, >> { "expired.info", MNU_GBX_FSTAINF, etc }, >> { "info.log", MNU_GBX_INFOLOG, etc }, >> #endif >> { NULL, 0, 0 }, >> }; >> EOF >> >> sed '6a\ >> #ifdef WITH_EMU\ >> { "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\ >> #endif >> ' <foo.c >bar.c >> -- 8< -- >> >> Now this looks ugly: >> >> git diff --no-index foo.c bar.c >> >> but this does not: >> >> git diff --no-index --no-indent-heuristic foo.c bar.c >> >> That heuristic is described in 433860f3d0 (diff: improve positioning of >> add/delete blocks in diffs, 2016-09-05). I'm not sure exactly why it >> does the wrong thing here, or if it would be possible to tweak the >> weighting factors to make it work. > > https://github.com/git/git/commit/433860f3d0 > > I would think that the long lines at the shift boundaries[1] have > bad impact on the algorithm as they are of different length > and influence the backstepping value. (vague theory) > > I wonder if we want to add language specifics to the heuristic, > such that '}' or 'end' in a given line have a strong signal that the > cut should be right after that line? i.e. that would decrease the > badness score. > > While this is not a general solution (but quite language specific), > I would think this is still a good idea to look at. > (When coming up with the heuristic, most people thought the bad > examples would come from exotic languages, but not C, which we > are all so familiar with such that we thought we had a good grasp > how to make a heuristic that works with C. Note that the example > looks like C, though). > > [1] > { "info.log", MNU_GBX_INFOLOG, etc }, > vs > { "SoftCam.Key", MNU_CFG_FSOFTCAMKEY, etc },\ > > > (slightly unrelated tangent on better diffs in general:) > A couple days ago I had another discussion on how > diffs are best presented to users. That discussion was > focused on a slightly higher layer, the diff algorithm itself, > instead of a post-algorithm boost-fixup. > > It is usually easier to both write as well as review code that > adds completely new features or projects instead of squashing > bugs or integrating subtle new features in an existing code base. > > This observation lead to the conclusion that reviewing large > locks of adjacent new lines is easier than reviewing blocks > where deletions as well as additions are found. So we theorized > that deletions should have more impact than additions when > computing the diff itself. The case presented here has no deletions > so this is not applicable, but most examples that came up once > 2477ab2ea8 (diff: support anchoring line(s), 2017-11-27) > was discussed would be likely to have better diffs. > > Stefan