Re: [PATCH 3/4] check-non-portable-shell: improve `VAR=val shell-func` detection

Eric Sunshine <sunshine@xxxxxxxxxxxxxx> · Fri, 26 Jul 2024 02:45:59 -0400

On Mon, Jul 22, 2024 at 10:46 AM Rubén Justo <rjusto@xxxxxxxxx> wrote:
> On Mon, Jul 22, 2024 at 02:59:13AM -0400, Eric Sunshine wrote:
> > -     /^\s*([A-Z0-9_]+=(\w*|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
> > +     /\b([A-Z0-9_]+=(\w*|(["']).*?\3)\s+)+(\w+)/ and !/test_env.+=/ and exists($func{$4}) and
>
> Losing "^\s*" means we'll cause false positives, such as:
>
>     # VAR=VAL shell-func
>     echo VAR=VAL shell-func

True, though, considering that "shell-func" in these examples must
match the name of a function actually defined in one of the input
files, one would expect (or at least hope) that this sort of
false positive will be exceedingly rare. Indeed, there are no such
false positives in the existing test scripts. Of course, we can always
tighten the regex later if it proves to be problematic.
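
To make the trade-off concrete, here is a rough Python sketch of the
two patterns from the diff above. It is not the Perl linter itself:
the expressions are ported to Python's `re` module, and FUNCS,
flagged_old(), flagged_new(), and the function name "shell_func"
(spelled with an underscore so that `\w+` captures the whole name) are
hypothetical stand-ins for the function table the script builds from
its input files.

```python
import re

# Hypothetical stand-in for the table of shell functions that the linter
# collects from its input files.
FUNCS = {"shell_func"}

# Shared tail of both patterns, copied from the diff.
ASSIGN = r'''([A-Z0-9_]+=(\w*|(["']).*?\3)\s+)+(\w+)'''
OLD = re.compile(r'^\s*' + ASSIGN)    # anchored at the start of the line
NEW = re.compile(r'\b' + ASSIGN)      # may match anywhere in the line
TEST_ENV = re.compile(r'test_env.+=') # exclusion added by the patch


def flagged_old(line):
    """Return True if the old, anchored pattern would flag the line."""
    m = OLD.search(line)
    return bool(m) and m.group(4) in FUNCS


def flagged_new(line):
    """Return True if the new, unanchored pattern would flag the line."""
    m = NEW.search(line)
    return bool(m) and not TEST_ENV.search(line) and m.group(4) in FUNCS


if __name__ == "__main__":
    for line in (
        "VAR=VAL shell_func",           # the case both patterns target
        "# VAR=VAL shell_func",         # comment: only the new pattern matches
        "echo VAR=VAL shell_func",      # argument to echo: same
        "test_env VAR=VAL shell_func",  # skipped by the test_env exclusion
    ):
        print(f"{line!r}: old={flagged_old(line)} new={flagged_new(line)}")
```

In this sketch the comment and the `echo` line are flagged only by the
new pattern, and only because "shell_func" is in FUNCS. With a name
that is not a known function, neither line is reported.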

> Regardless of that, the regex will continue to pose problems with:
>
>   VAR=$OTHER_VALUE shell-func
>   VAR=$(cmd) shell-func
>   VAR=VAL\ UE shell-func
>   VAR="\"val\" shell-func UE" non-shell-func
>
> Which, of course, should be cases that should be written in a more
> orthodox way.

Yes, it can be difficult to be thorough when "linting" a programming
language merely via regular expressions, and this particular
expression is already almost unreadable. The effort involved in trying
to make it perfect may very well outweigh the potential gain in
coverage.
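
For illustration, the four quoted cases can be run against a Python
port of the new pattern. This is only a sketch under assumptions:
Python's `re` stands in for Perl, KNOWN_FUNCS and would_flag() are
hypothetical names, and the function names use underscores
("shell_func", "non_shell_func") so that `\w+` captures them whole.

```python
import re

# Hypothetical stand-in for the linter's table of known shell functions.
KNOWN_FUNCS = {"shell_func"}

# Python port of the new pattern and the test_env exclusion from the patch.
PATTERN = re.compile(r'''\b([A-Z0-9_]+=(\w*|(["']).*?\3)\s+)+(\w+)''')
TEST_ENV = re.compile(r'test_env.+=')


def would_flag(line):
    """Return True if the ported pattern would report the line."""
    m = PATTERN.search(line)
    return bool(m) and not TEST_ENV.search(line) and m.group(4) in KNOWN_FUNCS


CASES = [
    r'VAR=$OTHER_VALUE shell_func',  # value is a variable expansion
    r'VAR=$(cmd) shell_func',        # value is a command substitution
    r'VAR=VAL\ UE shell_func',       # value contains an escaped space
    r'VAR="\"val\" shell_func UE" non_shell_func',  # escaped quotes in value
]

if __name__ == "__main__":
    for line in CASES:
        print(f"{would_flag(line)!s:5}  {line}")
```

The first three lines go unreported because the value alternative
accepts only `\w*` or a simple quoted string. The fourth is reported
even though the command being run is "non_shell_func": the lazy
`.*?` stops at an escaped quote inside the value, so "shell_func"
is captured as if it were the command.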

> But we will start to detect errors like the ones mentioned in the
> message, which are more likely to happen.

Indeed.