Re: BUG: wildmatches like foo/**/**/bar don't match properly due to internal optimizations

On 04/26, Duy Nguyen wrote:
> On Wed, Apr 26, 2017 at 2:13 AM, Ævar Arnfjörð Bjarmason
> <avarab@xxxxxxxxx> wrote:
> > Thought I'd just start another thread for this rather than tack it
> > onto the pathological case thread.
> >
> > In commit 4c251e5cb5 ("wildmatch: make /**/ match zero or more
> > directories", 2012-10-15) Duy added support for ** in globs.
> >
> > One test-case for this is:
> >
> >     match 1 0 'foo/baz/bar' 'foo/**/**/bar'
> >
> > I.e. foo/**/**/bar matches foo/baz/bar. However due to some
> > pre-pruning we do in pathspec/ls-tree we can't ever match it, because
> > the first thing we do is peel the first part of the path/pattern off,
> > i.e. foo/, and then match baz/bar against **/**/bar.
> 
> Yeah. I think the prefix compare trick predated wildmatch. When I
> introduced positional wildcards "**/" I failed to spot this. Good
> catch.
> 
> Ideally this sort of optimization should be contained within wildmatch
> (or whatever matching engine we'll be using). It also opens up more
> opportunities (like the precompiled patterns mentioned elsewhere in
> this thread).
> 
> You need to be careful though, when we do case-insensitive matching,
> sometimes we want to match the prefix case _sensitively_ instead. So
> we need to pass the "prefix" info in some cases to the matching
> engine.
> 
> I guess the time is now ripe (i.e. somebody volunteers to work on this ;-)
> to improve wildmatch. "Improve" can also mean "rewrite on top of pcre" if
> we really want to go that route, on which I have no opinion because I
> don't know about pcre availability on other (some obscure) platforms.

If we do end up improving wildmatch, we may also want a flag that lets
a pattern match a leading directory.  This would be useful in the
submodule case, where a user gives a pathspec that could reach into a
submodule, but we don't want to launch a child process to operate on
the submodule unless the first part of the pattern matches the
submodule.  Right now with recursive grep, if wildcards are used the
code just punts and says "yeah, this pattern may match something in the
submodule, but we won't know for sure until we actually try".
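
To make the idea concrete, here is a rough sketch of a leading-directory
check.  It is Python rather than git's C, and could_match_under() is a
hypothetical helper, not anything in wildmatch.c.  It assumes '**' only
appears as a whole path component and leans on fnmatchcase() for the
per-component matching, so it is a toy model of the semantics and not a
proposal for the real implementation.

```python
from fnmatch import fnmatchcase


def could_match_under(pattern: str, directory: str) -> bool:
    """Hypothetical helper: True if `pattern` matches `directory` itself
    or could still match some path below it.

    Assumptions of this sketch: a '**' component matches zero or more
    directories, every other component is matched one-to-one with
    fnmatchcase(), and pathspec rules such as "a pattern naming a parent
    directory matches everything beneath it" are NOT modelled.
    """
    pat = pattern.split("/")
    path = directory.strip("/").split("/")

    def walk(i: int, j: int) -> bool:
        if j == len(path):
            # Directory fully consumed: whatever remains of the pattern
            # could still match something inside it.
            return True
        if i == len(pat):
            # Pattern exhausted while directory components remain.
            return False
        if pat[i] == "**":
            # '**' matches zero components (skip it) or absorbs one more.
            return walk(i + 1, j) or walk(i, j + 1)
        return fnmatchcase(path[j], pat[i]) and walk(i + 1, j + 1)

    return walk(0, 0)
```

With something like this, recursive grep could skip a submodule at
src/sub for a pathspec such as 'docs/*.txt', but would still have to
descend for '*/sub/file' or '**/Makefile'.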

-- 
Brandon Williams


