Re: Bug or unexpected behaviour in git show <rev>:a\b

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 24, 2020 at 11:27:35AM -0800, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > So everything is working as designed, or at least explainable. But I
> > think there is some room for improvement. A backslash that isn't
> > followed by a glob meta-character _is_ still a meta character (your
> > "a\b" would be globbing for "ab"). But it's useless enough that I think
> > it shouldn't be enough to trigger the "oh, you probably meant this as a
> > pathspec" DWIM rule.
> 
> This sounds sensible.

OK, the patch I came up with is below.

> > We _could_ also say "even though this could technically be a pathspec
> > because of its metacharacter, it looks vaguely enough like a
> > path-in-tree revision that we shouldn't guess". That I'm less
> > comfortable with, just because it makes the heuristics even more
> > magical.
> 
> Not just it becomes more magical, I am afraid that the code to
> implement such a heuristics would be fragile and become a source of
> unnecessary bugs.  Let's not go there.

OK. It does mean that:

  git show HEAD:a*

will still quietly produce no output instead of saying "hey, there is no
a* in HEAD". But I think given the lack of bug reports over the years
that this case (and the backslash one I'm fixing) are probably
relatively rare.  The backslash one seems a lot more likely, just
because Windows folks may treat it like a path separator (I'm not sure
if that even works, considering its meaning in a glob, but certainly I
can imagine somebody doing so as an experiment and getting confused by
the result).

> I should learn to use "working as designed or at least explainable"
> more often in my responses, by the way.  That's quite a useful and
> good phrase ;-)

Perhaps that can be Git's motto. ;)

Anyway, here's the patch. Even though this is rare, I think it's worth
doing. The code is simple and I don't anticipate anybody complaining
about the tightening.

-- >8 --
Subject: verify_filename(): handle backslashes in "wildcards are pathspecs" rule

Commit 28fcc0b71a (pathspec: avoid the need of "--" when wildcard is
used, 2015-05-02) allowed:

  git rev-parse '*.c'

without the double-dash. But the rule it uses to check for wildcards
actually looks for any glob special. This is overly liberal, as it means
that a pattern that doesn't actually do any wildcard matching, like
"a\b", will be considered a pathspec.

If you do have such a file on disk, that's presumably what you wanted.
But if you don't, the results are confusing: rather than say "there's no
such path a\b", we'll quietly accept it as a pathspec which very likely
matches nothing (or at least not what you intended). Likewise, looking
for path "a\*b" doesn't expand the search at all; it would only find a
single entry, "a*b".

This commit switches the rule to trigger only when glob metacharacters
would expand the search, meaning both of those cases will now report an
error (you can still disambiguate using "--", of course; we're just
tightening the DWIM heuristic).

Note that we didn't test the original feature in 28fcc0b71a at all. So
this patch not only tests for these corner cases, but also adds a
regression test for the existing behavior.

Reported-by: David Burström <davidburstrom@xxxxxxxxxxx>
Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 setup.c                        | 23 ++++++++++++++++++++---
 t/t1506-rev-parse-diagnosis.sh | 14 ++++++++++++++
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/setup.c b/setup.c
index e2a479a64f..12228c0d9c 100644
--- a/setup.c
+++ b/setup.c
@@ -197,9 +197,26 @@ static void NORETURN die_verify_filename(struct repository *r,
  */
 static int looks_like_pathspec(const char *arg)
 {
-	/* anything with a wildcard character */
-	if (!no_wildcard(arg))
-		return 1;
+	const char *p;
+	int escaped = 0;
+
+	/*
+	 * Wildcard characters imply the user is looking to match pathspecs
+	 * that aren't in the filesystem. Note that this doesn't include
+	 * backslash even though it's a glob special; by itself it doesn't
+	 * cause any increase in the match. Likewise ignore backslash-escaped
+	 * wildcard characters.
+	 */
+	for (p = arg; *p; p++) {
+		if (escaped) {
+			escaped = 0;
+		} else if (is_glob_special(*p)) {
+			if (*p == '\\')
+				escaped = 1;
+			else
+				return 1;
+		}
+	}
 
 	/* long-form pathspec magic */
 	if (starts_with(arg, ":("))
diff --git a/t/t1506-rev-parse-diagnosis.sh b/t/t1506-rev-parse-diagnosis.sh
index 6d951ca015..8a75f37a11 100755
--- a/t/t1506-rev-parse-diagnosis.sh
+++ b/t/t1506-rev-parse-diagnosis.sh
@@ -222,4 +222,18 @@ test_expect_success 'reject Nth ancestor if N is too high' '
 	test_must_fail git rev-parse HEAD~100000000000000000000000000000000
 '
 
+test_expect_success 'pathspecs with wildcards are not ambiguous' '
+	echo "*.c" >expect &&
+	git rev-parse "*.c" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'backslash does not trigger wildcard rule' '
+	test_must_fail git rev-parse "foo\\bar"
+'
+
+test_expect_success 'escaped char does not trigger wildcard rule' '
+	test_must_fail git rev-parse "foo\\*bar"
+'
+
 test_done
-- 
2.25.0.421.gb74d19af79




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux