Re: git-apply: warn/fail on changed end of line (eol) only?

Igor Djordjevic BugA <igor.d.djordjevic@xxxxxxxxx> · Tue, 27 Dec 2016 01:26:55 +0100

Hi Junio, and thanks for sharing your thoughts.

You understood it pretty well, except the context "scope". Example does
show a single line change, and a single line old/new comparison, but
that is for simplicity sake of the initial example. What I was
discussing about was the whole patch "hunk" instead (mentioned inside
the part you quoted, too).

> But what if the patch was like this?
> 
>      1 <CRLF>
>     -3 <CRLF>
>     +three <CRLF>
>     +four <CRLF>
>      5 <CRLF>
> 
> Do you want to warn on "four <CRLF>" because it does not have any
> "previous" corresponding line?

No, because the old hunk (single - line) has <CRLF> line ending(s), the
same as the new hunk (both + lines), so effectively there is no end of
line change between old and new _hunks_, so no warning needed.

> Extending the thought further, which line do you want to warn and
> which line do you not want to, if the patch were like this instead?
> 
>      1 <CRLF>
>     -3 <CRLF>
>     +four <CRLF>
>     +three <CRLF>
>      5 <CRLF>

Again, no warnings here, same as described above.

> Extending this thought experiment further, you would realize that
> fundamentally the concept of "previous contents" has no sensible
> definition.

This one is interesting, and maybe arguable - I certainly don`t have
enough knowledge about the overall matter (Git heuristics for diff
creation, in terms of which parts get marked as "old" (-), and which as
"new" (+), which probably even depends on user defined settings)...

... BUT, the overall idea here seems rather simple (to me :P) - if *all*
removed lines inside the old hunk have the same line endings (<CRLF>,
for example), where there is *any* (at least one) added line inside the
new hunk that has a different line ending than the ones in the old hunk
(like <LF>), then warning or failure might be preferred (optionally,
at least), as it does seem like something fishy is going on.

I appreciate and understand a need for Git being able to know "correct"
line endings, but in case where the whole previous hunk (or the whole
file, even) has "incorrect" line endings (<CRLF>, as far as Git is
concerned, as in the given example), it seems more sensible to me for
Git to warn you about _that_ first, rather than keep silent and apply a
new hunk introducing <LF> line endings without you even knowing - heck,
maybe your core end of line setting is incorrect, indeed, so actually
having someone to let you know seems nice, before potentially corrupting
your file.

It felt like - "Hey, I do have a mean to know that your _whole file_
has incorrect line endings (as per my current settings), but I don`t
want to tell you, yet I`ll rather happily apply this patch which
introduces n lines with _correct_ line endings... So, now you still have
your whole incorrect file, but don`t worry, I fixed those n lines for
you behind your back, life is good, and you`re welcome."

Ok, I might be exaggerating, but it just doesn`t seem right, the end
result seems even worse - having "incorrect" and "correct" endings
silently mixed.

To prevent this, the most important question seems to be - do we have a
sensible way to tell "this is THE line ending of the old hunk (the only
line ending used)"?

And the worst case answer would probably be - you need to check the
whole (old) file, if all line endings are the same, that is THE line
ending you`ll use for comparison against line endings of the newly added
lines.

For the best case scenario, I don`t know enough of Git diff heuristics
to say if we would even be able to determine THE line endings based on
the old _hunk_ only, not to examine the whole old file.

_If_ we can find THE old hunk (or file) line endings, and patch
introduces different ones, then warn/fail option seems sensible...?

Of course, in case where old hunk/file already has mixed line endings
(like both <CRLF> and <LF>), we can either choose to warn right away (as
mixed lines endings are found already), or maybe even not warn at all,
as nothing actually changes in regards to line endings no matter which
one gets added, and we`re discussing "warn/fail on eol *change* only".

Otherwise, I agree about "previous contents" sensibility, especially in
the "blame" case, but this does seem a bit different... or does it?

Regards, BugA

Re: git-apply: warn/fail on *changed* end of line (eol) *only*?

Re: git-apply: warn/fail on changed end of line (eol) only?