Re: [PATCH 0/5] Abide by our own rules regarding line endings

Hi Junio,

On Wed, 3 May 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@xxxxxx> writes:
> 
> > For starters, those files include shell scripts: the most prevalent
> > shell interpreter in use (and certainly used in Git for Windows) is
> > Bash, and Bash does not handle CR/LF line endings gracefully.
> 
> Good to know.  I am not sure if it is OK for shell scripts not to honor
> the platform convention, though.

Well, a couple of comments about your comment:

- we say "shell scripts", but we're sloppy there: they are "Unix shell
  scripts", as they are executed by Unix shells. As such, it is pretty
  obvious that they favor Unix line endings, right? And that they do not
  really handle anything else well, right?

- You try to say that it is not okay for shell scripts to be checked
  out as LF-only when the platform convention for *text* files is CR/LF,
  right? Please note that if you follow through on this thought, you are
  very close to recommending to render shell scripts dysfunctional by
  checking them out with CR/LF endings.

That latter point, recommending that we break shell scripts, is something
I really fail to understand...
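
To make the first point concrete, here is a minimal Python sketch. It is
not part of Git: the two-line script and the helper name
`lines_as_unix_tool_sees_them` are made up for illustration. It mimics a
tool that treats only LF as the line terminator and shows what such a tool
sees in a file checked out with CR/LF endings.

```python
def lines_as_unix_tool_sees_them(data):
    # Only LF terminates a line; a preceding CR is ordinary data.
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final LF
    return lines


# Hypothetical two-line script, as checked out with CR/LF endings.
crlf_script = b"#!/bin/sh\r\necho hello\r\n"
# The same script with LF-only endings.
lf_script = crlf_script.replace(b"\r\n", b"\n")

crlf_lines = lines_as_unix_tool_sees_them(crlf_script)
lf_lines = lines_as_unix_tool_sees_them(lf_script)

print(crlf_lines)
print(lf_lines)
```

In the CR/LF case every line keeps a trailing CR byte, so the interpreter
named on the first line would be `/bin/sh<CR>` and the argument to `echo`
would be `hello<CR>`.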

> Stated from the opposite angle, I would not be surprised if your
> shell scripts do not work on Linux if you set core.autocrlf to true.
> Git may honor it, but shells on Linux (or BSD for that matter) do
> not pay attention to core.autocrlf and they are within their rights
> to complain on an extra CR at the end of the line.  IOW, I would
> doubt that it should be our goal to set core.autocrlf on a platform
> whose native line ending is LF and make the tests pass.

See? I *knew* it was a mistake to follow Jonathan's recommendation to make
this "you can reproduce this even on Linux" comment part of the commit
message.

I *never* asked to make core.autocrlf=true the default on Linux.

All I did was to point out that you do not need Windows to reproduce the
problems.

That is really a far cry from trying to convince anybody that it makes
sense to require Git to pass the build & tests with core.autocrlf=true *on
Linux*.

I want to make it pass on Windows, yes, and I do not want to force anybody
with a Linux setup to get a (free) Windows VM to test this. To make it
easier for you Linux-only folks, I tried to give you a way to start
validating my claim that Git introduced core.autocrlf=true without anybody
ever checking that Git itself builds and passes its own test suite with
that setting.

> > Related to shell scripts: when generating common-cmds.h, we use tools
> > that generally operate on the assumption that input and output
> > delimit their lines using LF-only line endings. Consequently, they
> > would happily copy the CR byte verbatim into the strings in
> > common-cmds.h, which in turn makes the C preprocessor barf (that
> > interprets them as MacOS-style line endings).
> 
> This indeed is a problem.  "add\r" is not a name of a common
> command, obviously,

Please note that it is not "add\r" that is part of the common-cmds.h file
as generated by current git.git's `master` with core.autocrlf=true. I.e.
it is not the sequence containing a backslash followed by an `r`.

It is actually "add<CR>", which the GNU C preprocessor interprets as a
line break in the middle of the string constant (most likely for
backwards-compatibility with MacOS, where line breaks were indicated by
Carriage Returns *without* Line Feeds).
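
A minimal Python sketch of that distinction, for illustration only: it is
not the real generate-cmdlist.sh, and both the input and the helper name
`emit_entry` are made up. It mimics an LF-only tool that wraps each input
line verbatim in double quotes to form a C string literal.

```python
def emit_entry(line):
    # Wrap the line verbatim in double quotes, as a naive generator would.
    return b'\t"' + line + b'",\n'


# Hypothetical input list, checked out with CR/LF endings.
listing = b"add\r\ncommit\r\n"

# Split on LF only; the CR stays attached to each name.
header = b"".join(emit_entry(l) for l in listing.split(b"\n") if l)

print(header)
print(header.hex(" "))
```

The hex dump shows a single 0d byte between `add` and the closing quote,
i.e. a raw Carriage Return inside the literal, not the two characters
backslash and `r`.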

> regardless of how the text file that lists the names of the commands is
> encoded.  I am undecided if it is a problem in the source text (i.e.
> command-list.txt is not a platform neutral "text" but has to be encoded
> with LF endings) or the bug in the tools used in the generate-cmdlist.sh
> script, though.  Shouldn't the tools be aware of the platform convention
> of what text files are and what their eol looks like?

I wonder why we spend so much time on discussing this issue, really.

Clearly, command-list.txt is *intended* as input for scripts. We do not
ship the file verbatim to the end user, we only pass it through sed to
generate common-cmds.h, we pass it through sed to verify the completeness
of the docs, and we pass it through the Perl script
Documentation/cmd-list.perl to generate certain command lists intended for
inclusion in the man page.

In all cases, we expect to feed the contents of this file to Unix shell
scripts and/or Unix tools.

Is it so unobvious that the input should be crafted to fulfill that role
as best as it can by catering to Unix tools?

And if you truly think that we should use different tools based on the
platform, then you will have to swallow the rather large pill that Git's
own very heavy use of Unix shell scripting was a big, big mistake from the
beginning.

I doubt you are ready to accept that yet...

Ciao,
Dscho


