On Fri, Jan 24, 2020 at 11:27:35AM -0800, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
>
> > So everything is working as designed, or at least explainable. But I
> > think there is some room for improvement. A backslash that isn't
> > followed by a glob meta-character _is_ still a meta character (your
> > "a\b" would be globbing for "ab"). But it's useless enough that I think
> > it shouldn't be enough to trigger the "oh, you probably meant this as a
> > pathspec" DWIM rule.
>
> This sounds sensible.

OK, the patch I came up with is below.

> > We _could_ also say "even though this could technically be a pathspec
> > because of its metacharacter, it looks vaguely enough like a
> > path-in-tree revision that we shouldn't guess". That I'm less
> > comfortable with, just because it makes the heuristics even more
> > magical.
>
> Not just it becomes more magical, I am afraid that the code to
> implement such a heuristics would be fragile and become a source of
> unnecessary bugs. Let's not go there.

OK. It does mean that:

  git show HEAD:a*

will still quietly produce no output instead of saying "hey, there is no
a* in HEAD". But I think given the lack of bug reports over the years
that this case (and the backslash one I'm fixing) are probably
relatively rare. The backslash one seems a lot more likely, just because
Windows folks may treat it like a path separator (I'm not sure if that
even works, considering its meaning in a glob, but certainly I can
imagine somebody doing so as an experiment and getting confused by the
result).

> I should learn to use "working as designed or at least explainable"
> more often in my responses, by the way. That's quite a useful and
> good phrase ;-)

Perhaps that can be Git's motto. ;)

Anyway, here's the patch. Even though this is rare, I think it's worth
doing. The code is simple and I don't anticipate anybody complaining
about the tightening.

-- >8 --
Subject: verify_filename(): handle backslashes in "wildcards are pathspecs" rule

Commit 28fcc0b71a (pathspec: avoid the need of "--" when wildcard is
used, 2015-05-02) allowed:

  git rev-parse '*.c'

without the double-dash. But the rule it uses to check for wildcards
actually looks for any glob special. This is overly liberal, as it means
that a pattern that doesn't actually do any wildcard matching, like
"a\b", will be considered a pathspec.

If you do have such a file on disk, that's presumably what you wanted.
But if you don't, the results are confusing: rather than say "there's no
such path a\b", we'll quietly accept it as a pathspec which very likely
matches nothing (or at least not what you intended). Likewise, looking
for path "a\*b" doesn't expand the search at all; it would only find a
single entry, "a*b".

This commit switches the rule to trigger only when glob metacharacters
would expand the search, meaning both of those cases will now report an
error (you can still disambiguate using "--", of course; we're just
tightening the DWIM heuristic).

Note that we didn't test the original feature in 28fcc0b71a at all. So
this patch not only tests for these corner cases, but also adds a
regression test for the existing behavior.

Reported-by: David Burström <davidburstrom@xxxxxxxxxxx>
Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 setup.c                        | 23 ++++++++++++++++++++---
 t/t1506-rev-parse-diagnosis.sh | 14 ++++++++++++++
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/setup.c b/setup.c
index e2a479a64f..12228c0d9c 100644
--- a/setup.c
+++ b/setup.c
@@ -197,9 +197,26 @@ static void NORETURN die_verify_filename(struct repository *r,
  */
 static int looks_like_pathspec(const char *arg)
 {
-	/* anything with a wildcard character */
-	if (!no_wildcard(arg))
-		return 1;
+	const char *p;
+	int escaped = 0;
+
+	/*
+	 * Wildcard characters imply the user is looking to match pathspecs
+	 * that aren't in the filesystem. Note that this doesn't include
+	 * backslash even though it's a glob special; by itself it doesn't
+	 * cause any increase in the match. Likewise ignore backslash-escaped
+	 * wildcard characters.
+	 */
+	for (p = arg; *p; p++) {
+		if (escaped) {
+			escaped = 0;
+		} else if (is_glob_special(*p)) {
+			if (*p == '\\')
+				escaped = 1;
+			else
+				return 1;
+		}
+	}
 
 	/* long-form pathspec magic */
 	if (starts_with(arg, ":("))
diff --git a/t/t1506-rev-parse-diagnosis.sh b/t/t1506-rev-parse-diagnosis.sh
index 6d951ca015..8a75f37a11 100755
--- a/t/t1506-rev-parse-diagnosis.sh
+++ b/t/t1506-rev-parse-diagnosis.sh
@@ -222,4 +222,18 @@ test_expect_success 'reject Nth ancestor if N is too high' '
 	test_must_fail git rev-parse HEAD~100000000000000000000000000000000
 '
 
+test_expect_success 'pathspecs with wildcards are not ambiguous' '
+	echo "*.c" >expect &&
+	git rev-parse "*.c" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'backslash does not trigger wildcard rule' '
+	test_must_fail git rev-parse "foo\\bar"
+'
+
+test_expect_success 'escaped char does not trigger wildcard rule' '
+	test_must_fail git rev-parse "foo\\*bar"
+'
+
 test_done
-- 
2.25.0.421.gb74d19af79
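
For anybody who wants to poke at the tightened rule outside of Git, the scan from the patch can be sketched as a standalone function. This is only an illustration, not the code in setup.c: has_expanding_wildcard(), glob_special_sketch() and check_examples() are hypothetical names, and the stand-in assumes the glob specials are "*", "?", "[" and backslash, whereas Git's real is_glob_special() is table-driven.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for Git's is_glob_special(). Assumption: the
 * glob specials are '*', '?', '[' and backslash.
 */
static int glob_special_sketch(char c)
{
	return c != '\0' && strchr("*?[\\", c) != NULL;
}

/*
 * Return 1 if arg contains an unescaped glob character that could widen
 * a match, 0 otherwise. A backslash only escapes the character after it
 * and never counts as a wildcard by itself.
 */
int has_expanding_wildcard(const char *arg)
{
	const char *p;
	int escaped = 0;

	for (p = arg; *p; p++) {
		if (escaped) {
			/* previous char was a backslash; skip this one */
			escaped = 0;
		} else if (glob_special_sketch(*p)) {
			if (*p == '\\')
				escaped = 1;
			else
				return 1;
		}
	}
	return 0;
}

/* The cases discussed in the commit message, written as assertions. */
void check_examples(void)
{
	assert(has_expanding_wildcard("*.c"));      /* real wildcard */
	assert(!has_expanding_wildcard("a\\b"));    /* a\b: lone backslash */
	assert(!has_expanding_wildcard("a\\*b"));   /* a\*b: escaped star */
	assert(has_expanding_wildcard("a\\\\*b"));  /* a\\*b: star is unescaped */
}
```

The last case is the one worth noticing: an escaped backslash followed by a star still counts as a wildcard, because the "escaped" flag is cleared after consuming exactly one character.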