When the 'grep with invalid UTF-8 data' tests were added/adjusted in 8a5999838e (grep: stess test PCRE v2 on invalid UTF-8 data, 2019-07-26) and 870eea8166 (grep: do not enter PCRE2_UTF mode on fixed matching, 2019-07-26) they lacked a redirect which caused them to falsely succeed on most architectures. They failed on big-endian arches where the test never reached the portion which was missing the redirect. A recent patch add the missing redirect and exposed the fact that the 'PCRE v2: grep non-ASCII from invalid UTF-8 data with -i' test fails on all architectures. Based on the final paragraph in in 870eea8166: When grepping a non-ASCII fixed string. This is a more general problem that's hard to fix, but we can at least fix the most common case of grepping for a fixed string without "-i". I can't think of a reason for why we'd turn on PCRE2_UTF when matching byte-for-byte like that. it seems that we don't expect that the case-insensitive grep will succeed. Adjust the test to reflect that expectation. Signed-off-by: Todd Zullinger <tmz@xxxxxxxxx> --- Hi, Andreas Schwab wrote: > Two tests in t7812 were missing redirects, failing to actually test the > produced output. > > Fixes: 8a5999838e ("grep: stess test PCRE v2 on invalid UTF-8 data") > Signed-off-by: Andreas Schwab <schwab@xxxxxxxxxxxxxx> Nice catch on these missing redirects. While testing the 2.24.0 rc's this test failed only on big-endian arches (like s390x in Fedora). It was briefly mentioned at the time in <20191020002648.GZ10893@xxxxxxxxx>. After applying the fix to add the missing redirects, the test now fails on all architectures. It seems like that is the intended state and we simply need to adjust the final test with `grep -i` accordingly. Of course, it's possible that I've misunderstood Ævar's intentions in 870eea8166 (grep: do not enter PCRE2_UTF mode on fixed matching, 2019-07-26) or otherwise have poorly explained the reasoning in my commit message. We don't appear to test PCRE (neither v1 nor v2) in our CI builds. I added libpcre2-dev and set USE_LIBPCRE to test this in travis. (Whether we want to enable that by default is worth discussing.) Doing so confirms that the tests (incorrectly) pass without the missing redirects and fail when the redirects are added. Inverting the expectation of test_cmp in the final `grep -i` test results in a successful test run. with PCRE2: https://travis-ci.org/tmzullinger/git/jobs/618733182 with PCRE2 and 'add missing redirects': https://travis-ci.org/tmzullinger/git/jobs/618723277 with PCRE2, 'add missing redirects' and this patch: https://travis-ci.org/tmzullinger/git/jobs/618796963 This still doesn't fix the failure on s390x. There test_might_fail git grep -hi "Æ" invalid-0x80 >actual fails and leaves nothing in 'actual' which causes the test to fail at the following test_cmp expected actual line. If we suspect the test might fail, it seems like we need to account for that and not expect that 'expect' will match 'actual' as we currently do. t/t7812-grep-icase-non-ascii.sh | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/t/t7812-grep-icase-non-ascii.sh b/t/t7812-grep-icase-non-ascii.sh index c4528432e5..03dba6685a 100755 --- a/t/t7812-grep-icase-non-ascii.sh +++ b/t/t7812-grep-icase-non-ascii.sh @@ -76,9 +76,12 @@ test_expect_success GETTEXT_LOCALE,LIBPCRE2 'PCRE v2: grep non-ASCII from invali test_expect_success GETTEXT_LOCALE,LIBPCRE2 'PCRE v2: grep non-ASCII from invalid UTF-8 data with -i' ' test_might_fail git grep -hi "Æ" invalid-0x80 >actual && - test_cmp expected actual && + if test -s actual + then + test_cmp expected actual + fi && test_must_fail git grep -hi "(*NO_JIT)Æ" invalid-0x80 >actual && - test_cmp expected actual + ! test_cmp expected actual ' test_done -- 2.24.0