Re: [PATCH v4 1/2] ref-filter: handle CRLF at end-of-line more gracefully

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Junio,

> Le 22 oct. 2020 à 20:52, Junio C Hamano <gitster@xxxxxxxxx> a écrit :
> 
> "Philippe Blain via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> 
>> From: Philippe Blain <levraiphilippeblain@xxxxxxxxx>
>> 
>> The ref-filter code does not correctly handle commit or tag messages
>> that use CRLF as the line terminator. Such messages can be created with
>> the `--cleanup=verbatim` option of `git commit` and `git tag`, or by
>> using `git commit-tree` directly.
>> 
>> The function `find_subpos` in ref-filter.c looks for two consecutive
>> LFs to find the end of the subject line, a sequence which is absent in
>> messages using CRLF. This results in the whole message being parsed as
>> the subject line (`%(contents:subject)`), and the body of the message
>> (`%(contents:body)`) being empty.
>> 
>> Moreover, in `copy_subject`, which wants to return the subject as a
>> single line, '\n' is replaced by space, but '\r' is
>> untouched.
> 
> Honestly, all of the above signal, at least to me, that these
> objects are designed to use LF terminated lines and nothing else,
> whether Windows or DOS existed in the same world or not.  There is
> no such thing as commit objects that use CRLF as the line
> terminator.  They are commit objects whose payload has CR at the end
> of each and every line.  Just like there can be commit objects whose
> payload has trailing SP on each line, or even has binary guck, these
> things can be created using the "commit --cleanup=verbatim" command,
> or the "hash-objects" command.  It does not mean it is encouraged to
> create such objects.  It does not mean it is sensible to expect them
> to behave as if these trailing whitespaces (be it SP or CR) are not
> there.
> 
>> This impacts the output of `git branch`, `git tag` and `git
>> for-each-ref`.
> 
> The answer to that problem description is "then don't" ;-).  If you
> do not want to have trailing whitespaces, you need to clean them up
> somehow, and we give an easy way to do so with the default --cleanup
> action.  Setting it to verbatim is to decline that easy way offered
> to you, and it makes it your responsibility to do your own clean-up
> if you still want to remove the CR at the end of your lines.

I agree with you on that : if you are creating the object yourself,
you should let the default cleanup take place.

But as a lot of projects use GitHub, GitLab or similar services
to accept contributions, and let these web systems perform the "merge" 
(or rebase or whatever) operation to integrate these contributions;
maintainers sometime choose to not always have complete control
on all objects that become part of the canonical history of their repository. 
And as I wrote in [1], GitLab was creating commits using CRLF up until 9.2... [2].
So for these poor projects that are now stuck with these CRLFs in their
merge commit messages, I think it's good that Git handles these correctly.

> Having said all that.
> 
> Here is how I explained the topic in the "What's cooking" report.
> 
>     A commit and tag object may have CR at the end of each and
>     every line (you can create such an object with hash-object or
>     using --cleanup=verbatim to decline the default clean-up
>     action), but it would make it impossible to have a blank line
>     to separate the title from the body of the message.  Be lenient
>     and accept a line with lone CR on it as a blank line, too.

Just for the sake of searchability, I think it would be good to have 
CRLF spelled out in this topic description (since I gather this is what 
ends up in the release notes). But I don't feel that strongly
about that.

> Let's not call this change a "bug fix".  The phrase you used in your
> title, "more gracefully", is a very good one.

It was your suggestion ;) 

> In the meantime, I've squashed your "oops forgot ||return 1" change
> into [PATCH 2/2].

Thanks for squashing it in.

Cheers,
Philippe.


[1] https://lore.kernel.org/git/63755050-10A5-4A46-9BB3-8207E055692C@xxxxxxxxx/
[2] https://gitlab.com/gitlab-org/gitlab-foss/-/issues/31671



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux