Changes since v2: * Readability improvement to the first patch, which fixes object name resolution with refs containing a curly brace * Fixed the second patch for cases like OBJ-COUNT-gHASH~13 and added a couple test cases for that. Also, extended the commit message a bit to discuss the cases brought up on the list. For the second patch, if folks want some open source examples where it could be triggered, I found two examples: * lore.git: git cat-file -t master:random/path/major-gaffed * git.git: git cat-file -t super-invalid~///\\.....@.lock-gfd0bba94e Elijah Newren (2): object-name: fix resolution of object names containing curly braces object-name: be more strict in parsing describe-like output object-name.c | 63 ++++++++++++++++++++++++++++++++++++++++++--- t/t1006-cat-file.sh | 31 +++++++++++++++++++++- t/t6120-describe.sh | 24 +++++++++++++++++ 3 files changed, 113 insertions(+), 5 deletions(-) base-commit: 063bcebf0c917140ca0e705cbe0fdea127e90086 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1844%2Fnewren%2Fobject-name-fix-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1844/newren/object-name-fix-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/1844 Range-diff vs v2: 1: 13f68bebe90 ! 1: b21b8fc5554 object-name: fix resolution of object names containing curly braces @@ Commit message * 'foo@@{...}' * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}' + Note that we'd prefer duplicating the special logic for "@^" characters + here, because if get_oid_basic() or interpret_nth_prior_checkout() or + get_oid_basic() or similar gain extra methods of using curly braces, + then the logic in get_oid_with_context_1() would need to be updated as + well. But it's not clear how to refactor all of these to have a simple + common callpoint with the specialized logic. + Reported-by: Gabriel Amaral <gabriel-amaral@xxxxxxxxxx> Helped-by: Michael Haggerty <mhagger@xxxxxxxxxx> Signed-off-by: Elijah Newren <newren@xxxxxxxxx> @@ object-name.c: static enum get_oid_result get_oid_with_context_1(struct reposito } for (cp = name, bracket_depth = 0; *cp; cp++) { - if (*cp == '{') -+ if (*(cp+1) == '{' && (*cp == '@' || *cp == '^')) { ++ if (strchr("@^", *cp) && cp[1] == '{') { + cp++; bracket_depth++; - else if (bracket_depth && *cp == '}') 2: 31f1c37b31a ! 2: 19f84dfc9cc object-name: be more strict in parsing describe-like output @@ Commit message 'g', and an abbreviated object name. which means that output of the format ${REFNAME}-${INTEGER}-g${HASH} - should parse to fully expand ${HASH}. This is fine. However, we + should parse to fully expanded ${HASH}. This is fine. However, we currently don't validate any of ${REFNAME}-${INTEGER}, we only parse -g${HASH} and assume the rest is valid. That is problematic, since it breaks things like git cat-file -p branchname:path/to/file/named/i-gaffed - which, when commit affed exists, will not return us information about a - file we are looking for but will instead tell us about commit affed. + which, when commit (or tree or blob) affed exists, will not return us + information about the file we are looking for but will instead + erroneously tell us about object affed. - Similarly, we should probably not treat - refs/tags/invalid/./../...../// ~^:/?*\\&[}/busted.lock-g049e0ef6 - as a request for commit 050e0ef6 either. + A few additional notes: + - This is a slight backward incompatibility break, because we used + to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}. However, + a backward incompatible break is necessary, because there is no + other way for someone to be more specific and disambiguate that they + want the blob master:path/to/who-gabbed instead of the object abbed. + - There is a possibility that check_refname_format() rules change in + the future. However, we can only realistically loosen the rules + for what that function accepts rather than tighten. If we were to + tighten the rules, some real world repositories may already have + refnames that suddenly become unacceptable and we break those + repositories. As such, any describe-like syntax of the form + ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the + changes in this commit will remain valid in the future. + - The fact that check_refname_format() rules could loosen in the + future is probably also an important reason to make this change. If + the rules loosen, there might be additional cases within + ${GARBAGE}-g${HASH} that become ambiguous in the future. While + abbreviated hashes can be disambiguated by abbreviating less, it may + well be that these alternative object names have no way of being + disambiguated (much like pathnames cannot be). Accepting all random + ${GARBAGE} thus makes it difficult for us to allow future + extensions to object naming. - Tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are - present and valid. + So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are + present in the string, and would be considered a valid ref and + non-negative integer. + + Also, add a few tests for git describe using object names of the form + ${REVISION_NAME}${MODIFIERS} + since an early version of this patch failed on constructs like + git describe v2.48.0-rc2-161-g6c2274cdbc^0 Reported-by: Gabriel Amaral <gabriel-amaral@xxxxxxxxxx> Signed-off-by: Elijah Newren <newren@xxxxxxxxx> @@ object-name.c: static int peel_onion(struct repository *r, const char *name, int + len = cp - name; + strbuf_init(&sb, len); + strbuf_add(&sb, name, len); -+ ret = !check_refname_format(name, flags); ++ ret = !check_refname_format(sb.buf, flags); + strbuf_release(&sb); + return ret; +} @@ object-name.c: static int get_describe_name(struct repository *r, return get_short_oid(r, ## t/t6120-describe.sh ## +@@ t/t6120-describe.sh: check_describe R-2-gHASH HEAD^^ + check_describe A-3-gHASH HEAD^^2 + check_describe B HEAD^^2^ + check_describe R-1-gHASH HEAD^^^ ++check_describe R-1-gHASH R-1-g$(git rev-parse --short HEAD^^)~1 + + check_describe c-7-gHASH --tags HEAD + check_describe c-6-gHASH --tags HEAD^ + check_describe e-1-gHASH --tags HEAD^^ + check_describe c-2-gHASH --tags HEAD^^2 ++check_describe c-2-gHASH --tags c-2-g$(git rev-parse --short HEAD^^2)^0 + check_describe B --tags HEAD^^2^ + check_describe e --tags HEAD^^^ + check_describe e --tags --exact-match HEAD^^^ @@ t/t6120-describe.sh: test_expect_success '--exact-match does not show --always fallback' ' test_must_fail git describe --exact-match --always ' -- gitgitgadget