Re: [PATCH v3 2/2] tests(mingw): avoid very slow `mingw_test_cmp`


On Sat, Nov 19, 2022 at 06:50:28PM +0100, René Scharfe wrote:
> Am 19.11.2022 um 09:18 schrieb Johannes Sixt:
> >
> > The reason that mingw_test_cmp exists is not that Git isn't ported
> > correctly, or that tests aren't ported correctly. The reason is that
> > tests assume Unix LF line endings everywhere, but there are some tools
> > that are outside our control that randomly -- to the layman's eye --
> > produce CRLF line endings even when their input has LF style.
> >
> > For example, when we post-process Git output with `sed`, the result
> > suddenly has CRLF line endings instead of LF that the input had.
>
> Actually I see the opposite behavior -- sed eats CRs on an up-to-date
> Git for Windows SDK:
>
>    $ uname -s
>    MINGW64_NT-10.0-22621
>
>    $ printf 'a\r\n' | hexdump.exe -C
>    00000000  61 0d 0a                                          |a..|
>    00000003
>
>    $ printf 'a\r\n' | sed '' | hexdump.exe -C
>    00000000  61 0a                                             |a.|
>    00000002


There is a "-b" option for sed under MINGW:
 -b, --binary
                  open files in binary mode (CR+LFs are not processed specially)

The CRLF handling for sed (and probably grep and awk) was changed in Cygwin:
https://cygwin.com/pipermail/cygwin/2017-June/233133.html

(And I suspect that this rippled into MINGW at some point.)
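
A quick way to check whether "-b" makes a difference on a given system could look like the sketch below. It assumes GNU sed (the "-b" option is GNU-specific) and uses od instead of hexdump only because od is more widely available. The variable names are made up for illustration. I would expect the CR to survive with "-b" everywhere; without it, the result depends on the platform, as René's output above shows.

```shell
# Sketch: compare the bytes sed emits with and without -b (GNU sed assumed).
# The input is the three bytes 'a', CR, LF.
default_out=$(printf 'a\r\n' | sed '' | od -An -tx1 | tr -d ' \n')
binary_out=$(printf 'a\r\n' | sed -b '' | od -An -tx1 | tr -d ' \n')

# Print both results as hex strings; "0d" is the CR.
echo "sed:    $default_out"
echo "sed -b: $binary_out"
```

If the two lines differ, sed is treating CR+LF specially in its default (text) mode.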


>
> And with the following patch on top of eea7033409 (The twelfth batch,
> 2022-11-14) the test suite passes for me -- just one case of grep
> stealing CRs seems to need adjustment to make mingw_test_cmp
> unnecessary:
>
>  t/t3920-crlf-messages.sh | 2 +-
>  t/test-lib.sh            | 1 -
>  2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/t/t3920-crlf-messages.sh b/t/t3920-crlf-messages.sh
> index 4c661d4d54..353b1a550e 100755
> --- a/t/t3920-crlf-messages.sh
> +++ b/t/t3920-crlf-messages.sh
> @@ -12,7 +12,7 @@ create_crlf_ref () {
>  	cat >.crlf-orig-$branch.txt &&
>  	cat .crlf-orig-$branch.txt | append_cr >.crlf-message-$branch.txt &&
>  	grep 'Subject' .crlf-orig-$branch.txt | tr '\n' ' ' | sed 's/[ ]*$//' | tr -d '\n' >.crlf-subject-$branch.txt &&
> -	grep 'Body' .crlf-message-$branch.txt >.crlf-body-$branch.txt || true &&
> +	grep 'Body' .crlf-orig-$branch.txt | append_cr >.crlf-body-$branch.txt || true &&


Talking about grep:
grep under Linux, macOS and MINGW (git bash) all seem to have the -U option:
  -U, --binary              do not strip CR characters at EOL (MSDOS/Windows)
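
Something like the following sketch could be used to check whether "-U" keeps the CR on a given platform. The input line and the variable name are only illustrative; this is not a proposed change to t3920.

```shell
# Sketch: feed grep a CRLF-terminated line and dump what comes out.
# With -U, grep should not strip the CR at end of line.
body_out=$(printf 'Body\r\n' | grep -U 'Body' | od -An -tx1 | tr -d ' \n')

# Print the result as a hex string; a "0d" before the final "0a" is the CR.
echo "grep -U: $body_out"
```

If that holds on MINGW, adding -U to the existing grep call might be an alternative to piping through append_cr, but I have not tested that.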




