On Sat, Nov 19, 2022 at 06:50:28PM +0100, René Scharfe wrote:
> Am 19.11.2022 um 09:18 schrieb Johannes Sixt:
> >
> > The reason that mingw_test_cmp exists is not that Git isn't ported
> > correctly, or that tests aren't ported correctly. The reason is that
> > tests assume Unix LF line endings everywhere, but there are some tools
> > that are outside our control that randomly -- to the layman's eye --
> > produce CRLF line endings even when their input has LF style.
> >
> > For example, when we post-process Git output with `sed`, the result
> > suddenly has CRLF line endings instead of LF that the input had.
>
> Actually I see the opposite behavior -- sed eats CRs on an up-to-date
> Git for Windows SDK:
>
>    $ uname -s
>    MINGW64_NT-10.0-22621
>
>    $ printf 'a\r\n' | hexdump.exe -C
>    00000000  61 0d 0a                                          |a..|
>    00000003
>
>    $ printf 'a\r\n' | sed '' | hexdump.exe -C
>    00000000  61 0a                                             |a.|
>    00000002

There is a "-b" option for sed under MINGW:

  -b, --binary
         open files in binary mode (CR+LFs are not processed specially)

The CRLF handling for sed (and probably grep and awk) was changed in cygwin:

https://cygwin.com/pipermail/cygwin/2017-June/233133.html

(And I suspect that this rippled into MINGW some day)

>
> And with the following patch on top of eea7033409 (The twelfth batch,
> 2022-11-14) the test suite passes for me -- just one case of grep
> stealing CRs seems to need adjustment to make mingw_test_cmp
> unnecessary:
>
>  t/t3920-crlf-messages.sh | 2 +-
>  t/test-lib.sh            | 1 -
>  2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/t/t3920-crlf-messages.sh b/t/t3920-crlf-messages.sh
> index 4c661d4d54..353b1a550e 100755
> --- a/t/t3920-crlf-messages.sh
> +++ b/t/t3920-crlf-messages.sh
> @@ -12,7 +12,7 @@ create_crlf_ref () {
>  	cat >.crlf-orig-$branch.txt &&
>  	cat .crlf-orig-$branch.txt | append_cr >.crlf-message-$branch.txt &&
>  	grep 'Subject' .crlf-orig-$branch.txt | tr '\n' ' ' | sed 's/[ ]*$//' | tr -d '\n' >.crlf-subject-$branch.txt &&
> -	grep 'Body' .crlf-message-$branch.txt >.crlf-body-$branch.txt || true &&
> +	grep 'Body' .crlf-orig-$branch.txt | append_cr >.crlf-body-$branch.txt || true &&

Talking about grep:
grep under Linux, macOS and MINGW (git bash) all seem to have the -U option:

  -U, --binary
         do not strip CR characters at EOL (MSDOS/Windows)
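
For anyone who wants to check how their own sed and grep treat CRs, with and without the binary options quoted above, here is a minimal sketch. The helpers count_cr and keeps_cr are made up for this mail and are not part of t/test-lib.sh. The last two calls assume GNU sed and GNU grep; on a sed or grep without -b / -U the command simply fails and gets reported as "strips CR", so read those two lines with care.

```shell
#!/bin/sh
# Hypothetical helper (not from Git's test suite): count CR bytes on stdin.
count_cr () {
	tr -cd '\r' | wc -c | tr -d ' '
}

# Hypothetical helper: feed a one-line CRLF sample through the given
# filter command and report whether the CR survived.
keeps_cr () {
	if test "$(printf 'a\r\n' | "$@" | count_cr)" = 1
	then
		echo "keeps CR: $*"
	else
		echo "strips CR: $*"
	fi
}

# Baselines that behave the same on every platform.
keeps_cr cat
keeps_cr tr -d '\r'

# Default behavior of the tools under discussion.
keeps_cr sed ''
keeps_cr grep a

# Assumption: GNU sed and GNU grep, which document -b and -U.
keeps_cr sed -b ''
keeps_cr grep -U a
```

On Linux I would expect every line except the `tr -d` one to say "keeps CR". The interesting comparison is between the default and the -b / -U lines when this is run in git bash.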