Re: [PATCH v3] line-range: Fix infinite loop bug with degenerate '$' regex

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Lars Kellogg-Stedman <lars@xxxxxxxxxx> writes:

> When the -L argument to "git log" is passed the degenerate regular
> expression "$" (as in "-L :$:line-range.c"), this results in an
> infinite loop in find_funcname_matching_regexp().

Is "matching an empty line" the only way a regular expression can be
a "degenerate" one?  If not, perhaps being a bit more explicit would
help the readers, e.g.

    ... a regular expression that matches any line, even an empty
    one, such as "-L :$:line-range.c", this results in ...

> The primary change is that we pre-decrement the beginning-of-line
> marker ('bol') before comparing it to '\n'. In the case of '$', where
> we start with bol == eol, this ensures that bol will find the
> beginning of the line on which the match occurred.

This clear explanation probably deserves to be in the commit log
message proper.

> @@ -148,8 +148,7 @@ static const char *find_funcname_matching_regexp(xdemitconf_t *xecfg, const char
>  		/* determine extent of line matched */
>  		bol = start+match[0].rm_so;
>  		eol = start+match[0].rm_eo;
> -		while (bol > start && *bol != '\n')
> -			bol--;
> +		while (bol > start && *--bol != '\n');

Please write it on two lines, and highlight an empty body of the
loop, like so

		while (condition)
			; /* nothing */

I think the intention of the above loop is to decrement bol
sufficiently and make it point at the terminating LF at the end
of the previous line, and then the next increment here

>  		if (*bol == '\n')
>  			bol++;

compensates to bring bol back to point at the beginning of line.

In the iteration of the loop when bol == start + 1, we inspect
*--bol (i.e.  *start).  If it is LF, we exit the loop, so bol ==
start and *bol == '\n'.  If it is not LF, we iterate one more time,
notice bol == start and exit the loop, so bol == start and *bol !=
'\n'.  bol will never go below start, so dereference *bol to see it
value where should be safe without OOB read.

> @@ -161,6 +160,7 @@ static const char *find_funcname_matching_regexp(xdemitconf_t *xecfg, const char
>  			return bol;
>  		start = eol;
>  	}
> +	return NULL;
>  }

What is this change about?  Isn't the above an endless loop without
break, from which the only way for the control to leave it is by
returning?

> diff --git a/t/t4211-line-log.sh b/t/t4211-line-log.sh
> index ac9e4d0928..19db07a8df 100755
> --- a/t/t4211-line-log.sh
> +++ b/t/t4211-line-log.sh
> @@ -315,4 +315,26 @@ test_expect_success 'line-log with --before' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'setup tests for zero-width regular expressions' '
> +	cat > expect <<-EOF

Style: lose the SP after "cat >".

> +	Modify func1() in file.c
> +	Add func1() and func2() in file.c
> +	EOF
> +'
> +
> +test_expect_success 'zero-width regex $ matches any function name' '
> +	git log --format="%s" --no-patch "-L:$:file.c" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'zero-width regex ^ matches any function name' '
> +	git log --format="%s" --no-patch "-L:^:file.c" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'zero-width regex . matches any function name' '

"." is one character wide, not zero-width.  Did you mean ".*"?

> +	git log --format="%s" --no-patch "-L:.:file.c" >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux