Re: [PATCH 2/4] t: remove \{m,n\} from BRE grep usage


 



Đoàn Trần Công Danh  <congdanhqx@xxxxxxxxx> writes:

> \{m,n\} is a GNU extension to BRE, and it's forbidden by our
> CodingGuidelines.

Is it?

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_06

says otherwise.  There may be some other GNU extensions to BRE that
allow you to write ERE elements with different syntax, but I doubt
this is one of them.  Perhaps you are thinking about "A\|B"
alternation?  In ERE "A|B" is alternation, and GNU BRE allows "A\|B"
but that is outside POSIX, IIUC.  "A\+" (1 or more of A) and "A\?"
(0 or 1 of A) are the same way.
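
To illustrate the distinction, here is a minimal sketch (not taken from
the patch) that matches a run of 40 zeros with the interval expression
in both BRE and ERE spelling.  The variable names and the sample input
line are made up for the illustration.

```shell
# Build a sample line starting with 40 zeros (hypothetical input).
zeros=$(printf '%040d' 0)
line="$zeros rest of the reflog entry"

# POSIX BRE: the interval is spelled with backslashed braces.
bre_hits=$(printf '%s\n' "$line" | grep -c '^0\{40\} ')

# ERE (grep -E): the same interval without backslashes.
ere_hits=$(printf '%s\n' "$line" | grep -c -E '^0{40} ')

echo "BRE matches: $bre_hits, ERE matches: $ere_hits"
```

Both commands use only syntax described in the POSIX page above; what
is outside POSIX is spelling ERE operators such as |, + and ? with a
backslash in a BRE.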

We do say in the guidelines that we don't use "\{m,n\}", but that
rule was written more than 10 years ago to codify a habit acquired
while having to deal with regexp implementations of various UNIX
variants, like early SystemV and BSD4, from more than 20 years ago.

If we are using the syntax in many of our tests that everybody runs,
that can be taken as a sign that the platforms that had problems
with the syntax have died out, or at least that Git does not matter
to them.

So my preference is to

 - Allow \{m,n\} when it makes sense and codify it in the guidelines

 - Rewriting tests is fine if it makes the result easier to read,
   but it shouldn't be done for the sole purpose of getting rid of
   the \{m,n\} syntax.

 - As there are folks without GNU tools, keep forbidding the GNU
   extensions for |, +, and ? in BRE until they are adopted widely.
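
As a rough sketch of what staying within POSIX could look like for
those three operators, the following avoids "\|", "\+" and "\?" while
matching the same things.  The sample inputs and variable names are
invented for the illustration, and the "\{0,1\}" and "\{1,\}" forms
assume the interval syntax is allowed as proposed in the first item.

```shell
fruits='apple
banana
cherry'

# Alternation: use ERE "|" or several -e patterns, not GNU BRE "\|".
alt_ere=$(printf '%s\n' "$fruits" | grep -c -E 'apple|cherry')
alt_multi=$(printf '%s\n' "$fruits" | grep -c -e apple -e cherry)

# One or more of "a": "aa*" (or "a\{1,\}"), not GNU BRE "a\+".
plus=$(printf 'b\nab\naab\n' | grep -c '^aa*b$')

# Zero or one of "a": "a\{0,1\}", not GNU BRE "a\?".
opt=$(printf 'b\nab\naab\n' | grep -c '^a\{0,1\}b$')

echo "$alt_ere $alt_multi $plus $opt"
```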

>  test_expect_success 'git branch -M baz bam should add entries to .git/logs/HEAD' '
>  	msg="Branch: renamed refs/heads/baz to refs/heads/bam" &&
> -	grep " 0\{40\}.*$msg$" .git/logs/HEAD &&
> -	grep "^0\{40\}.*$msg$" .git/logs/HEAD
> +	zero="00000000" &&
> +	zero="$zero$zero$zero$zero$zero" &&
> +	grep " $zero.*$msg$" .git/logs/HEAD &&
> +	grep "^$zero.*$msg$" .git/logs/HEAD
>  '

This is not good

>  test_expect_success 'git branch -M should leave orphaned HEAD alone' '
> diff --git a/t/t3305-notes-fanout.sh b/t/t3305-notes-fanout.sh
> index 22ffe5bcb9..aa3bb2e308 100755
> --- a/t/t3305-notes-fanout.sh
> +++ b/t/t3305-notes-fanout.sh
> @@ -9,7 +9,7 @@ path_has_fanout() {
>  	path=$1 &&
>  	fanout=$2 &&
>  	after_last_slash=$(($(test_oid hexsz) - $fanout * 2)) &&
> -	echo $path | grep -q "^\([0-9a-f]\{2\}/\)\{$fanout\}[0-9a-f]\{$after_last_slash\}$"
> +	echo $path | grep -q -E "^([0-9a-f][0-9a-f]/){$fanout}[0-9a-f]{$after_last_slash}$"

The use of -E makes it more readable and is good.  The innermost "a
pair of hexdigits" that would repeat $fanout times may be easier to
read if you keep the {2}, though.
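
Something along these lines is what I have in mind; it is only a
sketch to show the shape of the pattern.  To make it runnable outside
the test suite, the hexsz variable below is a stand-in I made up for
"test_oid hexsz", assuming 40 hex digits as with SHA-1.

```shell
# Stand-in for $(test_oid hexsz); an assumption for this sketch only.
hexsz=40

path_has_fanout () {
	path=$1 &&
	fanout=$2 &&
	after_last_slash=$((hexsz - fanout * 2)) &&
	# "{2}" spells out "a pair of hexdigits", repeated $fanout times.
	echo "$path" |
	grep -q -E "^([0-9a-f]{2}/){$fanout}[0-9a-f]{$after_last_slash}$"
}

# Example: two levels of fanout followed by the remaining 36 digits.
sample="ab/cd/$(printf '%036d' 0)"
if path_has_fanout "$sample" 2
then
	echo "fanout 2: yes"
fi
```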




