Re: Git diff misattributes the first word of a line to the previous line

Hi Gurjeet,

On 13/10/2022 07:45, Johannes Sixt wrote:
> Am 13.10.22 um 07:51 schrieb Gurjeet Singh:
>> Git diff seems to get confused about word boundaries, and includes the
>> first word from the next line.
> No, that would misattribute the perceived malfunction.
>
>> It seems that the first word of a line gets attributed to the previous
>> line, ignoring the fact that there's an intervening newline before the
>> word.

Given that this effect is part of the design (LF => whitespace), are
there any changes to the *documentation* that could be made to help
clarify this? E.g. looking back (you did check the manual? ;-), did you
miss some aspect of the man pages that could have been more prominent,
placed earlier, or clarified?

Why is this way of reporting even expected (e.g. confusion between
flowed text and line-oriented code, without a mode change), etc.?

Any other retrospective thoughts that could help?

--
Philip

>> [...]
>> $ git diff --word-diff=plain /tmp/1.txt /tmp/2.txt
>> diff --git a/tmp/1.txt b/tmp/2.txt
>> index 8239f93..099fb80 100644
>> --- a/tmp/1.txt
>> +++ b/tmp/2.txt
>> @@ -1,2 +1,2 @@
>>     x = yz [-ab-]{+opt1+}
>> {+    ac+} = [-cd ef-]{+pq opt2+}
>>
>> $ cat /tmp/1.txt
>>     x = yz
>>     ab = cd ef
>>
>> $ cat /tmp/2.txt
>>     x = yz opt1
>>     ac = pq opt2
> The reason for this is that the implementation of word-diff does not
> treat newline characters in any special way. They are treated as
> "whitespace" like any other character that is not captured by the
> word-diff patterns. Whitespace characters following each word are
> recorded, but are disregarded when the word-diff is computed. When the
> text is reconstructed in the output, these recorded space characters are
> printed only for unchanged and added words, but are not printed for
> removed words (IIRC). Combine this with the fact that when there is a
> change, i.e., a combination of removal and addition, then the removal is
> printed before the addition, and you get the observed output.
>
> I don't see an easy solution for this without completely rewriting the
> implementation.
>
> -- Hannes
>
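
The mechanism Hannes describes can be reproduced with a small toy model.
This is only a sketch under stated assumptions, not Git's implementation:
it assumes a "word" is any run of non-whitespace characters, uses Python's
difflib in place of Git's diff machinery, and the names split_words and
toy_word_diff are made up for illustration. Removed words are joined with
a single space for readability. Because the newline after "yz" is just
recorded whitespace, and the removal is printed before the addition, the
removed "ab" ends up on the first output line.

```python
import difflib
import re


def split_words(text):
    """Split text into leading whitespace and (word, trailing_ws) pairs.

    Assumption: a word is any run of non-whitespace characters, and a
    newline is ordinary whitespace recorded after the preceding word.
    """
    leading = re.match(r"\s*", text).group(0)
    pairs = re.findall(r"(\S+)(\s*)", text[len(leading):])
    return leading, pairs


def toy_word_diff(old, new):
    """Toy word diff: whitespace is ignored when comparing words."""
    _, old_pairs = split_words(old)
    leading, new_pairs = split_words(new)
    old_words = [word for word, _ in old_pairs]
    new_words = [word for word, _ in new_pairs]

    out = [leading]
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            # Removal comes first; the whitespace recorded after the
            # removed words (including any newline) is dropped.
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            # Added words keep the whitespace recorded after them.
            for word, ws in new_pairs[j1:j2]:
                out.append("{+" + word + "+}" + ws)
        if tag == "equal":
            # Unchanged words keep the whitespace recorded after them.
            for word, ws in new_pairs[j1:j2]:
                out.append(word + ws)
    return "".join(out)


if __name__ == "__main__":
    old = "    x = yz\n    ab = cd ef\n"
    new = "    x = yz opt1\n    ac = pq opt2\n"
    print(toy_word_diff(old, new), end="")
```

With the two files from the report, the first line printed by this sketch
is "    x = yz [-ab-]{+opt1+}", the same as in the reported output. The
second line is grouped differently from Git's, since the toy marks each
added word separately.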



