Re: Git diff misattributes the first word of a line to the previous line

On 13.10.22 at 07:51, Gurjeet Singh wrote:
> Git diff seems to get confused about word boundaries, and includes the
> first word from the next line.

No, that would misattribute the perceived malfunction.

> It seems that the first word of a line gets attributed to the previous
> line, ignoring the fact that there's an intervening newline before the
> word.
> [...]
> $ git diff --word-diff=plain /tmp/1.txt /tmp/2.txt
> diff --git a/tmp/1.txt b/tmp/2.txt
> index 8239f93..099fb80 100644
> --- a/tmp/1.txt
> +++ b/tmp/2.txt
> @@ -1,2 +1,2 @@
>     x = yz [-ab-]{+opt1+}
> {+    ac+} = [-cd ef-]{+pq opt2+}
> 
> $ cat /tmp/1.txt
>     x = yz
>     ab = cd ef
> 
> $ cat /tmp/2.txt
>     x = yz opt1
>     ac = pq opt2

This happens because the word-diff implementation does not treat newline
characters specially. They count as "whitespace", like any other
character that the word-diff patterns do not capture. The whitespace
following each word is recorded, but it is disregarded when the word
diff is computed. When the output text is reconstructed, the recorded
whitespace is printed only for unchanged and added words, not for
removed words (IIRC). Combine this with the fact that a change, i.e., a
removal combined with an addition, is printed with the removal before
the addition, and you get the observed output.
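
To make the mechanism concrete, here is a toy model in Python. It is only
a sketch of the behavior described above, not Git's actual code: it uses
Python's difflib instead of Git's diff engine, it treats every run of
non-whitespace as a word, and the names tokenize() and word_diff() are
made up for this illustration. Joining removed words with a single space
and closing and reopening the {+ +} markers around a newline are also
assumptions, chosen to mimic the shape of the quoted output.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Split text into (word, trailing_whitespace) pairs.

    A newline is just another whitespace character here.
    """
    return re.findall(r"(\S+)(\s*)", text)


def word_diff(old, new):
    """Toy word diff: whitespace is recorded but ignored when comparing."""
    old_tokens = tokenize(old)
    new_tokens = tokenize(new)
    old_words = [w for w, _ in old_tokens]
    new_words = [w for w, _ in new_tokens]

    # Whitespace before the first word is taken from the new text.
    out = [re.match(r"\s*", new).group()]

    # Only the words are compared; the recorded whitespace plays no role.
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged words are printed with their recorded whitespace.
            out.extend(w + ws for w, ws in new_tokens[j1:j2])
            continue
        if i2 > i1:
            # Removed words come first, without their trailing whitespace.
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if j2 > j1:
            # Added words keep their whitespace, including any newline
            # that separated them in the new text.
            added = new_tokens[j1:j2]
            body = "".join(w + ws for w, ws in added[:-1]) + added[-1][0]
            body = body.replace("\n", "+}\n{+")
            out.append("{+" + body + "+}" + added[-1][1])
    return "".join(out)


old = "    x = yz\n    ab = cd ef\n"
new = "    x = yz opt1\n    ac = pq opt2\n"
print(word_diff(old, new), end="")
```

With the two files from the report, the words "opt1" and "ac" form a
single added run, and the newline between them is printed inside that
run. The removed "ab" has already been printed before the run, with no
newline of its own, so it ends up on the first line.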

I don't see an easy solution for this without completely rewriting the
implementation.

-- Hannes



