Re: expand: Do not quote backslashes in unquoted parameter expansion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 27/03/2018 17:14, Herbert Xu wrote:
On Tue, Mar 27, 2018 at 02:57:12PM +0200, Harald van Dijk wrote:

That's a good point and wouldn't have too much of an impact on performance
of existing scripts. BTW, that means both expandmeta()'s metachars variable,
and expmeta()'s

   if (metaflag == 0 || lstat64(expdir, &statb) >= 0)

optimisation. You already got rid of the latter in your patch to prevent
buffer overflows.

No the purpose of this metaflag check is to bypass the lstat call.
My patch simply made it bypass the call by returning earlier.

Oh, right.

It is not relevant to whether we include backslash as a metacharacter.

I was thinking about not making backslashes set metaflag in expmeta(): when the pathname component doesn't include *, ?, or [, but does include backslashes, then the if (metaflag == 0) block could handle that as long as it performs the lstat64() check unconditionally. There's no need to go through the opendir()/readdir()/closedir() path for that case. Since expmeta() is bypassed for words not containing any potentially-magic characters, the impact might be small enough.

Regardless of whether metaflag is set, it would mean things like 'set "["' would start hitting the FS unnecessarily, if I understand correctly: the preglob()-expanded pattern is '\[', and expmeta() can no longer tell this apart from $v where v='\[' where a FS check would be required.

Perhaps preglob() should just be avoided, and expmeta() taught to respect both '\\' and CTLESC. '\\' would be a metacharacter and require a FS hit, CTLESC wouldn't require it. This would also avoid some memory allocations.

In regular /d\ev that doesn't come from a variable, the backslash will have
been replaced by CTLESC, but it's not necessary to treat CTLESC as a
metacharacter: in that case, either there's a match and pathname expansion
produces /dev, or there's no match and quote removal produces /dev, so the
optimisation is still valid. It'd only affect backslashes that come from a
variable, so it's very limited in scope.

For the case where the backslash came from the parser, the CTLESC
would have already been removed by preglob so there should be no
difference.

    if (!strpbrk(str->text, metachars))
        goto nometa;

in expandmeta() is done before preglob() is called. Here, it is still not necessary to include CTLESC.

The only problem with the metacharacter approach is that glibc's
glob(3) probably won't treat it as a metacharacter and therefore
it will still behave in the same way as the current bash.

This doesn't matter much yet, but yeah, it will if you decide to enable --enable-glob by default in the future. As long as you don't include GLOB_NOESCAPE, then normally, it should be taken as escaping the next character, but it might still trigger GLOB_NOMAGIC to return early. If it does, I'd take that as a glibc bug, but that doesn't help dash.

Cheers,
Harald van Dijk
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [LARTC]     [Bugtraq]     [Yosemite Forum]     [Photo]

  Powered by Linux