On Thu, Oct 11, 2018 at 05:19:06AM -0500, dana wrote:
> Hello,
>
> I'm a contributor to ripgrep, which is a grep-like tool that supports using
> gitignore files to control which files are searched in a repo (or any other
> directory tree). ripgrep's support for the patterns in these files is based on
> git's official documentation, as seen here:
>
>   https://git-scm.com/docs/gitignore
>
> One of the most common reports on the ripgrep bug tracker is that it does not
> allow patterns like the following real-world examples, where a ** is used along
> with other text within the same path component:
>
>   **/**$$*.java
>   **.orig
>   **local.properties
>   !**.sha1
>
> The reason it doesn't allow them is that the gitignore documentation explicitly
> states that they're invalid:
>
> ...

I've checked the code and run some tests. There is a twist here. "**"
is only special when matched in "pathname" mode, that is, when the
pattern contains at least one slash. Of your patterns above, that
applies only to the first one.

When '**' is special and it is not one of '**/', '/**/' or '/**', it
_is_ considered invalid (i.e. a bad pattern) and the pattern will not
match anything.

The confusion arises because in the remaining three patterns '**' is
not special: it is treated as a regular '*' and still matches stuff.

So, I think we have two options. The documentation could be clarified
with something like this

-- 8< --
diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index d107daaffd..500cd43939 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -100,7 +100,8 @@ PATTERN FORMAT
    a shell glob pattern and checks for a match against the
    pathname relative to the location of the `.gitignore` file
    (relative to the toplevel of the work tree if not from a
-   `.gitignore` file).
+   `.gitignore` file). Note that the "two consecutive asterisks" rule
+   below does not apply.
 
  - Otherwise, Git treats the pattern as a shell glob: "`*`" matches
    anything except "`/`", "`?`" matches any one character except "`/`"
@@ -129,7 +130,8 @@ full pathname may have special meaning:
    matches zero or more directories. For example, "`a/**/b`"
    matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on.
 
- - Other consecutive asterisks are considered invalid.
+ - Other consecutive asterisks are considered invalid and the pattern
+   is ignored.
 
 NOTES
 -----
-- 8< --

Or we could make the behavior consistent: if '**' is invalid, just
treat it as two separate regular '*'. Then all four of your patterns
will behave the same way. The change for that is quite simple

-- 8< --
diff --git a/wildmatch.c b/wildmatch.c
index d074c1be10..64087bf02c 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -104,8 +104,10 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
 				    dowild(p + 1, text, flags) == WM_MATCH)
 					return WM_MATCH;
 				match_slash = 1;
-			} else
-				return WM_ABORT_MALFORMED;
+			} else {
+				/* without WM_PATHNAME, '*' == '**' */
+				match_slash = flags & WM_PATHNAME ? 0 : 1;
+			}
 		} else
 			/* without WM_PATHNAME, '*' == '**' */
 			match_slash = flags & WM_PATHNAME ? 0 : 1;
-- 8< --

Which way should we go? I'm leaning towards the second one...
--
Duy
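
To make the distinction above concrete, here is a rough sketch of the
current behavior as a standalone classifier. It is an illustration only,
not the code in wildmatch.c: the names `dstar_kind` and
`classify_double_star` are made up for this example, and it assumes that
backslash escapes and bracket expressions can be ignored. It only
reports how a run of asterisks would be treated; it does no matching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only; none of this is git's API. */
enum dstar_kind {
	DSTAR_NONE,      /* no run of two or more asterisks in the pattern */
	DSTAR_PLAIN,     /* no slash: the run behaves like a single asterisk */
	DSTAR_SPECIAL,   /* pathname mode, every run is a whole path component */
	DSTAR_MALFORMED  /* pathname mode, some run is mixed with other text */
};

enum dstar_kind classify_double_star(const char *pattern)
{
	size_t len, i = 0;
	int pathname, seen = 0;

	assert(pattern != NULL);

	/* A leading '!' only negates the pattern; skip it. */
	if (*pattern == '!')
		pattern++;

	/* A trailing slash only marks "directories only"; ignore it. */
	len = strlen(pattern);
	if (len > 0 && pattern[len - 1] == '/')
		len--;

	/* Any remaining slash puts the pattern in pathname mode. */
	pathname = memchr(pattern, '/', len) != NULL;

	while (i < len) {
		size_t start = i;

		if (pattern[i] != '*') {
			i++;
			continue;
		}
		while (i < len && pattern[i] == '*')
			i++;
		if (i - start < 2)
			continue;	/* a lone asterisk is never special */
		seen = 1;
		if (!pathname)
			continue;
		/* The run must be bounded by a slash or the pattern edge. */
		if ((start > 0 && pattern[start - 1] != '/') ||
		    (i < len && pattern[i] != '/'))
			return DSTAR_MALFORMED;
	}

	if (!seen)
		return DSTAR_NONE;
	return pathname ? DSTAR_SPECIAL : DSTAR_PLAIN;
}
```

For the four patterns quoted above, this sketch reports the first one
as DSTAR_MALFORMED and the other three as DSTAR_PLAIN, which is the
inconsistency in question. With the second option, the malformed case
would instead be handled like a regular '*'.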