:(glob)path differs from plain pathspec that it uses wildmatch with WM_PATHNAME while the other uses fnmatch without FNM_PATHNAME. The difference lies in how '*' (and '**') is processed. With the introduction of :(glob) and :(literal) and their global options --[no]glob-pathspecs, the user can: - make everything literal by default via --noglob-pathspecs --literal-pathspecs cannot be used for this purpose as it disables _all_ pathspec magic. - individually turn on globbing with :(glob) - make everything globbing by default via --glob-pathspecs - individually turn off globbing with :(literal) The implication behind this is, there is no way to gain the default matching behavior (i.e. fnmatch without FNM_PATHNAME). You either get new globbing or literal. The old fnmatch behavior is considered deprecated and discouraged to use. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> --- Documentation/git.txt | 19 ++++++++++++ Documentation/glossary-content.txt | 29 ++++++++++++++++++ builtin/add.c | 9 ++++-- builtin/ls-tree.c | 2 +- cache.h | 2 ++ dir.c | 28 +++++++++-------- dir.h | 9 +++--- git.c | 8 +++++ pathspec.c | 47 +++++++++++++++++++++++++--- pathspec.h | 4 ++- t/t6130-pathspec-noglob.sh | 63 ++++++++++++++++++++++++++++++++++++++ tree-walk.c | 9 +++--- 12 files changed, 198 insertions(+), 31 deletions(-) diff --git a/Documentation/git.txt b/Documentation/git.txt index a3fbc59..52b85b6 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -452,6 +452,17 @@ help ...`. This is equivalent to setting the `GIT_LITERAL_PATHSPECS` environment variable to `1`. +--glob-pathspecs: + Add "glob" magic to all pathspec. This is equivalent to setting + the `GIT_GLOB_PATHSPECS` environment variable to `1`. Disabling + globbing on individual pathspecs can be done using pathspec + magic ":(literal)" + +--noglob-pathspecs: + Add "literal" magic to all pathspec. This is equivalent to setting + the `GIT_NOGLOB_PATHSPECS` environment variable to `1`. Enabling + globbing on individual pathspecs can be done using pathspec + magic ":(glob)" GIT COMMANDS ------------ @@ -847,6 +858,14 @@ GIT_LITERAL_PATHSPECS:: literal paths to Git (e.g., paths previously given to you by `git ls-tree`, `--raw` diff output, etc). +GIT_GLOB_PATHSPECS:: + Setting this variable to `1` will cause Git to treat all + pathspecs as glob patterns (aka "glob" magic). + +GIT_NOGLOB_PATHSPECS:: + Setting this variable to `1` will cause Git to treat all + pathspecs as literal (aka "literal" magic). + Discussion[[Discussion]] ------------------------ diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt index 186b566..5c8da50 100644 --- a/Documentation/glossary-content.txt +++ b/Documentation/glossary-content.txt @@ -333,6 +333,35 @@ top `/`;; literal;; Wildcards in the pattern such as `*` or `?` are treated as literal characters. + +glob;; + Git treats the pattern as a shell glob suitable for + consumption by fnmatch(3) with the FNM_PATHNAME flag: + wildcards in the pattern will not match a / in the pathname. + For example, "Documentation/{asterisk}.html" matches + "Documentation/git.html" but not "Documentation/ppc/ppc.html" + or "tools/perf/Documentation/perf.html". ++ +Two consecutive asterisks ("`**`") in patterns matched against +full pathname may have special meaning: + + - A leading "`**`" followed by a slash means match in all + directories. For example, "`**/foo`" matches file or directory + "`foo`" anywhere, the same as pattern "`foo`". "**/foo/bar" + matches file or directory "`bar`" anywhere that is directly + under directory "`foo`". + + - A trailing "/**" matches everything inside. For example, + "abc/**" matches all files inside directory "abc", relative + to the location of the `.gitignore` file, with infinite depth. + + - A slash followed by two consecutive asterisks then a slash + matches zero or more directories. For example, "`a/**/b`" + matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on. + + - Other consecutive asterisks are considered invalid. ++ +Glob magic is incompatible with literal magic. -- + Currently only the slash `/` is recognized as the "magic signature", diff --git a/builtin/add.c b/builtin/add.c index 663ddd1..1dab246 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -541,11 +541,16 @@ int cmd_add(int argc, const char **argv, const char *prefix) /* * file_exists() assumes exact match */ - GUARD_PATHSPEC(&pathspec, PATHSPEC_FROMTOP | PATHSPEC_LITERAL); + GUARD_PATHSPEC(&pathspec, + PATHSPEC_FROMTOP | + PATHSPEC_LITERAL | + PATHSPEC_GLOB); for (i = 0; i < pathspec.nr; i++) { const char *path = pathspec.items[i].match; - if (!seen[i] && !file_exists(path)) { + if (!seen[i] && + ((pathspec.items[i].magic & PATHSPEC_GLOB) || + !file_exists(path))) { if (ignore_missing) { int dtype = DT_UNKNOWN; if (is_excluded(&dir, path, &dtype)) diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index bdb03f3..1056634 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -173,7 +173,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix) * cannot be lifted until it is converted to use * match_pathspec_depth() or tree_entry_interesting() */ - parse_pathspec(&pathspec, 0, + parse_pathspec(&pathspec, PATHSPEC_GLOB, PATHSPEC_PREFER_CWD, prefix, argv + 1); for (i = 0; i < pathspec.nr; i++) diff --git a/cache.h b/cache.h index 276330e..c844805 100644 --- a/cache.h +++ b/cache.h @@ -361,6 +361,8 @@ static inline enum object_type object_type(unsigned int mode) #define GIT_NOTES_REWRITE_REF_ENVIRONMENT "GIT_NOTES_REWRITE_REF" #define GIT_NOTES_REWRITE_MODE_ENVIRONMENT "GIT_NOTES_REWRITE_MODE" #define GIT_LITERAL_PATHSPECS_ENVIRONMENT "GIT_LITERAL_PATHSPECS" +#define GIT_GLOB_PATHSPECS_ENVIRONMENT "GIT_GLOB_PATHSPECS" +#define GIT_NOGLOB_PATHSPECS_ENVIRONMENT "GIT_NOGLOB_PATHSPECS" /* * This environment variable is expected to contain a boolean indicating diff --git a/dir.c b/dir.c index 65cac36..7e931af 100644 --- a/dir.c +++ b/dir.c @@ -52,26 +52,28 @@ int fnmatch_icase(const char *pattern, const char *string, int flags) return fnmatch(pattern, string, flags | (ignore_case ? FNM_CASEFOLD : 0)); } -inline int git_fnmatch(const char *pattern, const char *string, - int flags, int prefix) +inline int git_fnmatch(const struct pathspec_item *item, + const char *pattern, const char *string, + int prefix) { - int fnm_flags = 0; - if (flags & GFNM_PATHNAME) - fnm_flags |= FNM_PATHNAME; if (prefix > 0) { if (strncmp(pattern, string, prefix)) return FNM_NOMATCH; pattern += prefix; string += prefix; } - if (flags & GFNM_ONESTAR) { + if (item->flags & PATHSPEC_ONESTAR) { int pattern_len = strlen(++pattern); int string_len = strlen(string); return string_len < pattern_len || strcmp(pattern, string + string_len - pattern_len); } - return fnmatch(pattern, string, fnm_flags); + if (item->magic & PATHSPEC_GLOB) + return wildmatch(pattern, string, WM_PATHNAME, NULL); + else + /* wildmatch has not learned no FNM_PATHNAME mode yet */ + return fnmatch(pattern, string, 0); } static int fnmatch_icase_mem(const char *pattern, int patternlen, @@ -111,7 +113,8 @@ static size_t common_prefix_len(const struct pathspec *pathspec) GUARD_PATHSPEC(pathspec, PATHSPEC_FROMTOP | PATHSPEC_MAXDEPTH | - PATHSPEC_LITERAL); + PATHSPEC_LITERAL | + PATHSPEC_GLOB); for (n = 0; n < pathspec->nr; n++) { size_t i = 0, len = 0; @@ -206,8 +209,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix, } if (item->nowildcard_len < item->len && - !git_fnmatch(match, name, - item->flags & PATHSPEC_ONESTAR ? GFNM_ONESTAR : 0, + !git_fnmatch(item, match, name, item->nowildcard_len - prefix)) return MATCHED_FNMATCH; @@ -238,7 +240,8 @@ int match_pathspec_depth(const struct pathspec *ps, GUARD_PATHSPEC(ps, PATHSPEC_FROMTOP | PATHSPEC_MAXDEPTH | - PATHSPEC_LITERAL); + PATHSPEC_LITERAL | + PATHSPEC_GLOB); if (!ps->nr) { if (!ps->recursive || @@ -1299,7 +1302,8 @@ int read_directory(struct dir_struct *dir, const char *path, int len, const stru GUARD_PATHSPEC(pathspec, PATHSPEC_FROMTOP | PATHSPEC_MAXDEPTH | - PATHSPEC_LITERAL); + PATHSPEC_LITERAL | + PATHSPEC_GLOB); if (has_symlink_leading_path(path, len)) return dir->nr; diff --git a/dir.h b/dir.h index 7d051e3..343ec7a 100644 --- a/dir.h +++ b/dir.h @@ -199,10 +199,9 @@ extern int fnmatch_icase(const char *pattern, const char *string, int flags); /* * The prefix part of pattern must not contains wildcards. */ -#define GFNM_PATHNAME 1 /* similar to FNM_PATHNAME */ -#define GFNM_ONESTAR 2 /* there is only _one_ wildcard, a star */ - -extern int git_fnmatch(const char *pattern, const char *string, - int flags, int prefix); +struct pathspec_item; +extern int git_fnmatch(const struct pathspec_item *item, + const char *pattern, const char *string, + int prefix); #endif diff --git a/git.c b/git.c index 7dd07aa..b43da7d 100644 --- a/git.c +++ b/git.c @@ -146,6 +146,14 @@ static int handle_options(const char ***argv, int *argc, int *envchanged) setenv(GIT_LITERAL_PATHSPECS_ENVIRONMENT, "0", 1); if (envchanged) *envchanged = 1; + } else if (!strcmp(cmd, "--glob-pathspecs")) { + setenv(GIT_GLOB_PATHSPECS_ENVIRONMENT, "1", 1); + if (envchanged) + *envchanged = 1; + } else if (!strcmp(cmd, "--noglob-pathspecs")) { + setenv(GIT_NOGLOB_PATHSPECS_ENVIRONMENT, "1", 1); + if (envchanged) + *envchanged = 1; } else { fprintf(stderr, "Unknown option: %s\n", cmd); usage(git_usage_string); diff --git a/pathspec.c b/pathspec.c index 9802829..e3581da 100644 --- a/pathspec.c +++ b/pathspec.c @@ -57,7 +57,6 @@ char *find_pathspecs_matching_against_index(const struct pathspec *pathspec) * * Possible future magic semantics include stuff like: * - * { PATHSPEC_NOGLOB, '!', "noglob" }, * { PATHSPEC_ICASE, '\0', "icase" }, * { PATHSPEC_RECURSIVE, '*', "recursive" }, * { PATHSPEC_REGEXP, '\0', "regexp" }, @@ -71,6 +70,7 @@ static struct pathspec_magic { } pathspec_magic[] = { { PATHSPEC_FROMTOP, '/', "top" }, { PATHSPEC_LITERAL, 0, "literal" }, + { PATHSPEC_GLOB, '\0', "glob" }, }; /* @@ -93,6 +93,8 @@ static unsigned prefix_pathspec(struct pathspec_item *item, const char *elt) { static int literal_global = -1; + static int glob_global = -1; + static int noglob_global = -1; unsigned magic = 0, short_magic = 0, global_magic = 0; const char *copyfrom = elt, *long_magic_end = NULL; char *match; @@ -103,6 +105,22 @@ static unsigned prefix_pathspec(struct pathspec_item *item, if (literal_global) global_magic |= PATHSPEC_LITERAL; + if (glob_global < 0) + glob_global = git_env_bool(GIT_GLOB_PATHSPECS_ENVIRONMENT, 0); + if (glob_global) + global_magic |= PATHSPEC_GLOB; + + if (noglob_global < 0) + noglob_global = git_env_bool(GIT_NOGLOB_PATHSPECS_ENVIRONMENT, 0); + + if (glob_global && noglob_global) + die(_("global 'glob' and 'noglob' pathspec settings are incompatible")); + + if ((global_magic & PATHSPEC_LITERAL) && + (global_magic & ~PATHSPEC_LITERAL)) + die(_("global 'literal' pathspec setting is incompatible " + "with all other global pathspec settings")); + if (elt[0] != ':' || literal_global) { ; /* nothing to do */ } else if (elt[1] == '(') { @@ -167,12 +185,24 @@ static unsigned prefix_pathspec(struct pathspec_item *item, magic |= short_magic; *p_short_magic = short_magic; + + /* --noglob-pathspec adds :(literal) _unless_ :(glob) is specifed */ + if (noglob_global && !(magic & PATHSPEC_GLOB)) + global_magic |= PATHSPEC_LITERAL; + + /* --glob-pathspec is overriden by :(literal) */ + if ((global_magic & PATHSPEC_GLOB) && (magic & PATHSPEC_LITERAL)) + global_magic &= ~PATHSPEC_GLOB; + magic |= global_magic; if (pathspec_prefix >= 0 && (prefixlen || (prefix && *prefix))) die("BUG: 'prefix' magic is supposed to be used at worktree's root"); + if ((magic & PATHSPEC_LITERAL) && (magic & PATHSPEC_GLOB)) + die(_("%s: 'literal' and 'glob' are incompatible"), elt); + if (pathspec_prefix >= 0) { match = xstrdup(copyfrom); prefixlen = pathspec_prefix; @@ -248,10 +278,17 @@ static unsigned prefix_pathspec(struct pathspec_item *item, item->nowildcard_len = prefixlen; } item->flags = 0; - if (item->nowildcard_len < item->len && - item->match[item->nowildcard_len] == '*' && - no_wildcard(item->match + item->nowildcard_len + 1)) - item->flags |= PATHSPEC_ONESTAR; + if (magic & PATHSPEC_GLOB) { + /* + * FIXME: should we enable ONESTAR in _GLOB for + * pattern "* * / * . c"? + */ + } else { + if (item->nowildcard_len < item->len && + item->match[item->nowildcard_len] == '*' && + no_wildcard(item->match + item->nowildcard_len + 1)) + item->flags |= PATHSPEC_ONESTAR; + } /* sanity checks, pathspec matchers assume these are sane */ assert(item->nowildcard_len <= item->len && diff --git a/pathspec.h b/pathspec.h index ec15fe6..7608172 100644 --- a/pathspec.h +++ b/pathspec.h @@ -5,10 +5,12 @@ #define PATHSPEC_FROMTOP (1<<0) #define PATHSPEC_MAXDEPTH (1<<1) #define PATHSPEC_LITERAL (1<<2) +#define PATHSPEC_GLOB (1<<3) #define PATHSPEC_ALL_MAGIC \ (PATHSPEC_FROMTOP | \ PATHSPEC_MAXDEPTH | \ - PATHSPEC_LITERAL) + PATHSPEC_LITERAL | \ + PATHSPEC_GLOB) #define PATHSPEC_ONESTAR 1 /* the pathspec pattern sastisfies GFNM_ONESTAR */ diff --git a/t/t6130-pathspec-noglob.sh b/t/t6130-pathspec-noglob.sh index 8551b02..ea00d71 100755 --- a/t/t6130-pathspec-noglob.sh +++ b/t/t6130-pathspec-noglob.sh @@ -32,6 +32,16 @@ test_expect_success 'star pathspec globs' ' test_cmp expect actual ' +test_expect_success 'star pathspec globs' ' + cat >expect <<-\EOF && + bracket + star + vanilla + EOF + git log --format=%s -- ":(glob)f*" >actual && + test_cmp expect actual +' + test_expect_success 'bracket pathspec globs and matches literal brackets' ' cat >expect <<-\EOF && bracket @@ -41,6 +51,15 @@ test_expect_success 'bracket pathspec globs and matches literal brackets' ' test_cmp expect actual ' +test_expect_success 'bracket pathspec globs and matches literal brackets' ' + cat >expect <<-\EOF && + bracket + vanilla + EOF + git log --format=%s -- ":(glob)f[o][o]" >actual && + test_cmp expect actual +' + test_expect_success 'no-glob option matches literally (vanilla)' ' echo vanilla >expect && git --literal-pathspecs log --format=%s -- foo >actual && @@ -89,4 +108,48 @@ test_expect_success 'no-glob environment variable works' ' test_cmp expect actual ' +test_expect_success 'setup xxx/bar' ' + mkdir xxx && + test_commit xxx xxx/bar +' + +test_expect_success '**/ works with :(glob)' ' + cat >expect <<-\EOF && + xxx + unrelated + EOF + git log --format=%s -- ":(glob)**/bar" >actual && + test_cmp expect actual +' + +test_expect_success '**/ does not work with --noglob-pathspecs' ' + : >expect && + git --noglob-pathspecs log --format=%s -- "**/bar" >actual && + test_cmp expect actual +' + +test_expect_success '**/ works with :(glob) and --noglob-pathspecs' ' + cat >expect <<-\EOF && + xxx + unrelated + EOF + git --noglob-pathspecs log --format=%s -- ":(glob)**/bar" >actual && + test_cmp expect actual +' + +test_expect_success '**/ works with --glob-pathspecs' ' + cat >expect <<-\EOF && + xxx + unrelated + EOF + git --glob-pathspecs log --format=%s -- "**/bar" >actual && + test_cmp expect actual +' + +test_expect_success '**/ does not work with :(literal) and --glob-pathspecs' ' + : >expect && + git --glob-pathspecs log --format=%s -- ":(literal)**/bar" >actual && + test_cmp expect actual +' + test_done diff --git a/tree-walk.c b/tree-walk.c index 676bd7f..a44f528 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -639,7 +639,8 @@ enum interesting tree_entry_interesting(const struct name_entry *entry, GUARD_PATHSPEC(ps, PATHSPEC_FROMTOP | PATHSPEC_MAXDEPTH | - PATHSPEC_LITERAL); + PATHSPEC_LITERAL | + PATHSPEC_GLOB); if (!ps->nr) { if (!ps->recursive || @@ -685,8 +686,7 @@ enum interesting tree_entry_interesting(const struct name_entry *entry, return entry_interesting; if (item->nowildcard_len < item->len) { - if (!git_fnmatch(match + baselen, entry->path, - item->flags & PATHSPEC_ONESTAR ? GFNM_ONESTAR : 0, + if (!git_fnmatch(item, match + baselen, entry->path, item->nowildcard_len - baselen)) return entry_interesting; @@ -727,8 +727,7 @@ match_wildcards: strbuf_add(base, entry->path, pathlen); - if (!git_fnmatch(match, base->buf + base_offset, - item->flags & PATHSPEC_ONESTAR ? GFNM_ONESTAR : 0, + if (!git_fnmatch(item, match, base->buf + base_offset, item->nowildcard_len)) { strbuf_setlen(base, base_offset + baselen); return entry_interesting; -- 1.8.2.83.gc99314b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html