Re: [PATCH v2] t4210: detect REG_ILLSEQ dynamically

Carlo Marcelo Arenas Belón  <carenas@xxxxxxxxx> writes:

> diff --git a/t/helper/test-regex.c b/t/helper/test-regex.c
> index 10284cc56f..7a8ddce45b 100644
> --- a/t/helper/test-regex.c
> +++ b/t/helper/test-regex.c
> @@ -41,16 +41,21 @@ int cmd__regex(int argc, const char **argv)
>  {
>  	const char *pat;
>  	const char *str;
> -	int flags = 0;
> +	int ret, silent = 0, flags = 0;
>  	regex_t r;
>  	regmatch_t m[1];
> +	char errbuf[64];
>  
>  	if (argc == 2 && !strcmp(argv[1], "--bug"))
>  		return test_regex_bug();
>  	else if (argc < 3)
>  		usage("test-tool regex --bug\n"
> -		      "test-tool regex <pattern> <string> [<options>]");
> +		      "test-tool regex [--silent] <pattern> <string> [<options>]");
>
> +	if (!strcmp(argv[1], "--silent")) {
> +		silent = 1;
> +		argv++;
> +	}

This looks fishy: if argc == 3 and the first argument is "--silent",
only the <pattern> is left in argv.  Before taking <string> out of
argv, we need to ensure argc is still large enough, but I do not
think that is done below:

>  	argv++;
>  	pat = *argv++;
>  	str = *argv++;

So str here would be NULL and/or *argv++ would have given you an
out-of-bounds access already.

> @@ -67,8 +72,14 @@ int cmd__regex(int argc, const char **argv)
>  	}
>  	git_setup_gettext();
>  
> -	if (regcomp(&r, pat, flags))
> -		die("failed regcomp() for pattern '%s'", pat);
> +	ret = regcomp(&r, pat, flags);
> +	if (ret) {
> +		if (silent)
> +			return 1;
> +
> +		regerror(ret, &r, errbuf, sizeof(errbuf));
> +		die("failed regcomp() for pattern '%s' (%s)", pat, errbuf);
> +	}
>  	if (regexec(&r, str, 1, m, 0))
>  		return 1;

Not that it matters _too_ much as this is merely a test helper and
it would not hurt anybody as long as our callers are careful.

> diff --git a/t/t4210-log-i18n.sh b/t/t4210-log-i18n.sh
> index c3792081e6..a89f456817 100755
> --- a/t/t4210-log-i18n.sh
> +++ b/t/t4210-log-i18n.sh
> @@ -10,6 +10,12 @@ latin1_e=$(printf '\351')
>  # invalid UTF-8
>  invalid_e=$(printf '\303\50)') # ")" at end to close opening "("
>  
> +if test_have_prereq GETTEXT_LOCALE &&
> +	! LC_ALL=$is_IS_locale test-tool regex --silent $latin1_e $latin1_e EXTENDED
> +then
> +	have_reg_illseq=1
> +fi

OK.  Have we cleared the have_reg_illseq shell variable before we
reach this point?  If not, we should (think: an environment variable
the end user had set before starting the test).
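
Something along these lines, as a sketch only.  probe_reg_illseq is a
hypothetical stand-in for the test_have_prereq and test-tool check in
the patch, and the first assignment merely simulates a value leaking
in from the end user's environment:

```shell
# Hypothetical stand-in for the probe in the patch; here it reports
# "condition not met", so the variable must stay empty afterwards.
probe_reg_illseq () {
	return 1
}

# Simulate a value inherited from the end user's environment.
have_reg_illseq=from-environment

# Clear first, so that only the probe below can set the variable.
have_reg_illseq=
if probe_reg_illseq
then
	have_reg_illseq=1
fi
```

An explicit "unset have_reg_illseq" before the "if" would work just
as well.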

> @@ -56,38 +62,68 @@ test_expect_success !MINGW 'log --grep does
>  	test_must_be_empty actual
>  '
>  
> +trigger_undefined_behaviour()
> +{

Style:

	triggers_undefined_behaviour () {

My first two readings of this patch mistakenly told me that the name
of the function was an instruction to the test to trigger an
undefined behaviour to see what happens, but this helper answers the
question "does the given engine trigger an undefined behaviour (with
the test data we are going to throw at it)?", right?  Perhaps renaming
the helper to "triggerS_undefined_behaviour" would reduce the risk
of inviting such a misinterpretation.

> +	local engine=$1
> +
> +	case $engine in
> +	fixed)
> +		if test -n "$have_reg_illseq" &&
> +			! test_have_prereq LIBPCRE2
> +		then
> +			return 0
> +		else
> +			return 1
> +		fi
> +		;;
> +	basic|extended)
> +		if test -n "$have_reg_illseq"
> +		then
> +			return 0
> +		else
> +			return 1
> +		fi
> +		;;
> +	perl)
> +		return 1
> +		;;
> +	esac
> +}

... and the return value is true for "yes it would trigger undefined
behaviour" and false for "no it would not".

>  for engine in fixed basic extended perl
>  do
>  	prereq=
>  	if test $engine = "perl"
>  	then
> +		prereq=PCRE
>  	fi
>  	force_regex=
>  	if test $engine != "fixed"
>  	then
> +		force_regex='.*'
>  	fi
>  
>  	test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not find non-reencoded values (latin1 + locale)" "
> +		LC_ALL=$is_IS_locale git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$utf8_e\" >actual &&

Can we do something about these overlong lines, by the way?
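
One possibility, sketched under the assumption that the test bodies
stay inside double quotes: a backslash followed by a newline is
removed by the shell there, so the command can be folded without
changing what gets run.  "echo" stands in for git below so that the
snippet runs anywhere, and the variable values are examples only.

```shell
engine=basic
force_regex='.*'

# The body as one overlong line ("echo" stands in for git).
long="echo -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex\""

# The same body folded with backslash-newline; inside double quotes
# the shell drops each backslash-newline pair.
folded="echo -c grep.patternType=$engine log \
--encoding=ISO-8859-1 --format=%s \
--grep=\"$force_regex\""
```

In the real test the continuation lines would be indented, which only
adds whitespace between words of the command.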

>  		test_must_be_empty actual
>  	"
>  
> +	if ! trigger_undefined_behaviour $engine
> +	then

Much easier to read than the ILLSEQ prerequisite, I would think,
even though the overlong lines are annoying.

> +		test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep searches in log output encoding (latin1 + locale)" "
> +			cat >expect <<-\EOF &&
> +			latin1
> +			utf8
> +			EOF
> +			LC_ALL=$is_IS_locale git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$latin1_e\" >actual &&
> +			test_cmp expect actual
> +		"
> +
> +		test_expect_success !MINGW,GETTEXT_LOCALE,$prereq "-c grep.patternType=$engine log --grep does not die on invalid UTF-8 value (latin1 + locale + invalid needle)" "
> +			LC_ALL=$is_IS_locale git -c grep.patternType=$engine log --encoding=ISO-8859-1 --format=%s --grep=\"$force_regex$invalid_e\" >actual &&
> +			test_must_be_empty actual
> +		"
> +	fi
>  done
>  
>  test_done
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 0ea1e5a05e..81473fea1d 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1454,12 +1454,6 @@ case $uname_s in
>  	test_set_prereq SED_STRIPS_CR
>  	test_set_prereq GREP_STRIPS_CR
>  	;;
> -FreeBSD)
> -	test_set_prereq REGEX_ILLSEQ
> -	test_set_prereq POSIXPERM
> -	test_set_prereq BSLASHPSPEC
> -	test_set_prereq EXECKEEPSPID
> -	;;
>  *)
>  	test_set_prereq POSIXPERM
>  	test_set_prereq BSLASHPSPEC

Nice to be able to drop one case arm from here.  Thanks.



