Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes: > This assumes that the platform is either CR/LF or LF. That is incorrect. > Windows does *not* dictate the line endings to be CR/LF. Certain > applications do. > > The shell is an MSys2 shell, and MSys2 convention is to use LF. Hence the > shell is done correctly for the platform. > > Yet on Windows (and in cross-platform projects also on Linux and MacOSX), > you have to deal with the fact that text files produced outside the cozy > little shell can have different line endings. I know that. That is why I said this is platform specific. When somebody on Ubuntu comes and says "XYZ does not work when I saved my text file with CRLF", I can say "don't do that then". I've been assuming that you cannot say the same to your users, in order to be nicer to Windows. You say MSys2 convention is to use LF, but what do you exactly mean by that? I am hoping that you do not mean "we treat a file created by a DOS editor as a text file with CR appended at the end of each line". I'd instead expect that a file the user wrote ABC on the first line will give you ABC in $word if the user did this: #!/bin/sh read word <file and not "ABC^M". That is, "we may produce only LF files, we consume LF files just fine, but we also treat CRLF files as text as their users intended." > As I said, I am uninterested in arguing for arguing's sake. I already explained why this particular thing does not have anything to do with IFS but comes from the way your "read" behaves, which is not in line with the way users on your platform expects. Is that so hard to understand? An argument you make to support adding stuff to IFS _is_ an argument for arguing's sake. > What I really needed you to do indicate the way forward. Yes, and I already did, didn't I? I already said that mucking with IFS is a wrong thing to do, and indicated that that is not the way forward. I also already said that fixing your read is the only sensible thing to do in the longer term iff you claim to support those who use "certain applications" that leave CRLF line terminator. * http://pubs.opengroup.org/onlinepubs/9699919799/utilities/read.html defines the operation as a two-step process. It is supposed to read an input line and then "The terminating <newline> (if any) shall be removed from the input". And then "the result shall be split into fields" (at IFS boundaries). As POSIX does not say <newline> has to be a single octet with value 10 (the same way you are allowed to show CRLF when you call printf("\n"), and you do not see CR left at the end when you use gets(3)). So on a platform that (optionally) accepts CRLF as a line terminator, "read" should read up to LF for a line, and if there is a CR immediately before that LF, remove that CR, too. Now you got the result of "terminating <newline> (if any) shall be removed". Then look at "the result" and split into fields using IFS. That way, you still process LF terminated files correctly, and you would never show a single ^M for an empty line, which triggered this thread. * Optionally, you may want to implement the "field splitting" (at IFS boundaries) with a small twist. When the script/user included LF in IFS, split at LF but remove CR if there is one that immediately precedes the LF. And also I gave you a more localized (i.e. less likely to negatively affect unrelated parts of the system) workaround that we could use, until your "read" is fixed. Mind you, I am not happy with that "$cr may appear at the end of the last token on the line" workaround. Other places in the code (both inside and outside Git scripted porcelains) that calls "read" to read from text files can suffer from the same issue, until it is fixed. Covering up the bug in "read" by throwing extra characters to IFS, especially when read is not the only user of IFS, is not a fix. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html