[PATCH] t7812: expect failure for grep -i with invalid UTF-8 data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



When the 'grep with invalid UTF-8 data' tests were added/adjusted in
8a5999838e (grep: stess test PCRE v2 on invalid UTF-8 data, 2019-07-26)
and 870eea8166 (grep: do not enter PCRE2_UTF mode on fixed matching,
2019-07-26) they lacked a redirect which caused them to falsely succeed
on most architectures.  They failed on big-endian arches where the test
never reached the portion which was missing the redirect.

A recent patch add the missing redirect and exposed the fact that the
'PCRE v2: grep non-ASCII from invalid UTF-8 data with -i' test fails on
all architectures.

Based on the final paragraph in in 870eea8166:

    When grepping a non-ASCII fixed string. This is a more general problem
    that's hard to fix, but we can at least fix the most common case of
    grepping for a fixed string without "-i". I can't think of a reason
    for why we'd turn on PCRE2_UTF when matching byte-for-byte like that.

it seems that we don't expect that the case-insensitive grep will
succeed.  Adjust the test to reflect that expectation.

Signed-off-by: Todd Zullinger <tmz@xxxxxxxxx>
---

Hi,

Andreas Schwab wrote:
> Two tests in t7812 were missing redirects, failing to actually test the
> produced output.
> 
> Fixes: 8a5999838e ("grep: stess test PCRE v2 on invalid UTF-8 data")
> Signed-off-by: Andreas Schwab <schwab@xxxxxxxxxxxxxx>

Nice catch on these missing redirects.  While testing the
2.24.0 rc's this test failed only on big-endian arches (like
s390x in Fedora).  It was briefly mentioned at the time in
<20191020002648.GZ10893@xxxxxxxxx>.

After applying the fix to add the missing redirects, the
test now fails on all architectures.  It seems like that is
the intended state and we simply need to adjust the final
test with `grep -i` accordingly.

Of course, it's possible that I've misunderstood Ævar's
intentions in 870eea8166 (grep: do not enter PCRE2_UTF mode
on fixed matching, 2019-07-26) or otherwise have poorly
explained the reasoning in my commit message.

We don't appear to test PCRE (neither v1 nor v2) in our CI
builds.  I added libpcre2-dev and set USE_LIBPCRE to test
this in travis.  (Whether we want to enable that by default
is worth discussing.)

Doing so confirms that the tests (incorrectly) pass without
the missing redirects and fail when the redirects are added.
Inverting the expectation of test_cmp in the final `grep -i`
test results in a successful test run.

with PCRE2:
https://travis-ci.org/tmzullinger/git/jobs/618733182

with PCRE2 and 'add missing redirects':
https://travis-ci.org/tmzullinger/git/jobs/618723277

with PCRE2, 'add missing redirects' and this patch:
https://travis-ci.org/tmzullinger/git/jobs/618796963

This still doesn't fix the failure on s390x.  There

    test_might_fail git grep -hi "Æ" invalid-0x80 >actual

fails and leaves nothing in 'actual' which causes the test
to fail at the following

    test_cmp expected actual

line.  If we suspect the test might fail, it seems like we
need to account for that and not expect that 'expect' will
match 'actual' as we currently do.

 t/t7812-grep-icase-non-ascii.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/t/t7812-grep-icase-non-ascii.sh b/t/t7812-grep-icase-non-ascii.sh
index c4528432e5..03dba6685a 100755
--- a/t/t7812-grep-icase-non-ascii.sh
+++ b/t/t7812-grep-icase-non-ascii.sh
@@ -76,9 +76,12 @@ test_expect_success GETTEXT_LOCALE,LIBPCRE2 'PCRE v2: grep non-ASCII from invali
 
 test_expect_success GETTEXT_LOCALE,LIBPCRE2 'PCRE v2: grep non-ASCII from invalid UTF-8 data with -i' '
 	test_might_fail git grep -hi "Æ" invalid-0x80 >actual &&
-	test_cmp expected actual &&
+	if test -s actual
+	then
+	    test_cmp expected actual
+	fi &&
 	test_must_fail git grep -hi "(*NO_JIT)Æ" invalid-0x80 >actual &&
-	test_cmp expected actual
+	! test_cmp expected actual
 '
 
 test_done
-- 
2.24.0





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux