On Mon, Nov 20, 2023 at 11:13:45AM -0500, Taylor Blau wrote: > When trying to resolve a revision query like "HEAD~~~~~", our call > pattern looks something like: > > - object-name.c::get_oid_with_context() > - object-name.c::get_oid_1() > - object-name.c::get_nth_ancestor() > - object-name.c::get_oid_1() > - ... > > With `get_nth_ancestor()` and `get_oid_1()` mutually recurring, popping > one '~' off of the revision query for each round of the recursion. One thing I noticed just now is that we have exactly the same problem with `^`, just with a different callstack. This problem isn't yet addressed by your patch. I have to wonder whether we should tighten restrictions even further: instead of manually keeping track of how deep in the stack we are, we limit the length of revisions to at most 1MB. I would claim that this limit is sufficiently large to never be a problem in practice. Revisions are limited to 4kB on most platforms anyway due to the maximum path length. And if somebody really wants to request the (1024 * 1024) + 1th parent, they shouldn't do that by appending this many "~" or "^" chars, but instead they should ask for "~1048577" or "^1048577". I realize that this is much more restrictive than the current patch. But it would be a good defensive mechanism against all kinds of weird revs, and I am very certain that there are other ways to blow the stack or cause out-of-bounds reads or writes here. Patrick > Since this recursive behavior is unbounded, having too many "~"'s > contained in a revision query will cause us to blow the stack. > Generating a message like this when compiled under SANITIZE=address: > > $ valgrind git rev-parse "HEAD$(perl -e "print \"~\" x 1000000000000")" > ==597453== Memcheck, a memory error detector > ==597453== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. > ==597453== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info > ==597453== Command: /home/ttaylorr/local/bin/git.compile diff HEAD~~~~~~~~~~~~[...] > ==597453== > AddressSanitizer:DEADLYSIGNAL > ================================================================= > ==597453==ERROR: AddressSanitizer: stack-overflow on address 0x7fffdd838ff8 (pc 0x7f2726082748 bp 0x7fffdd839110 sp 0x7fffdd839000 T0) > #0 0x7f2726082748 in __asan::GetTLSFakeStack() ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:176 > #1 0x7f2726082748 in GetFakeStackFast ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:193 > #2 0x7f27260833de in OnMalloc ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:207 > #3 0x7f27260833de in __asan_stack_malloc_1 ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:256 > #4 0x563f9077d9d8 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1087 > #5 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295 > #6 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092 > #7 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295 > #8 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092 > [...] > #247 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295 > #248 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092 > > SUMMARY: AddressSanitizer: stack-overflow ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:176 in __asan::GetTLSFakeStack() > ==597453==ABORTING > > (Note that the actual stack is much deeper. GDB reports that the bottom > of the stack looks something like the following): > > #54866 0x0000555555c6d3bf in get_oid_with_context_1 (repo=0x5555563849a0 <the_repo>, name=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., flags=128, prefix=0x0, oid=0x7ffff5713d40, oc=0x7ffff5713d90) at object-name.c:1947 > #54867 0x0000555555c6e2fa in get_oid_with_context (repo=0x5555563849a0 <the_repo>, str=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., flags=128, oid=0x7ffff5713d40, oc=0x7ffff5713d90) at object-name.c:2096 > #54868 0x0000555555d8eed8 in handle_revision_arg_1 (arg_=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., revs=0x7ffff5b000d0, flags=0, revarg_opt=0) at revision.c:2174 > #54869 0x0000555555d8f1a9 in handle_revision_arg (arg=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., revs=0x7ffff5b000d0, flags=0, revarg_opt=0) at revision.c:2189 > #54870 0x0000555555d97ca9 in setup_revisions (argc=2, argv=0x7fffffff4970, revs=0x7ffff5b000d0, opt=0x0) at revision.c:2932 > #54871 0x00005555557d6a63 in cmd_diff (argc=2, argv=0x7fffffff4970, prefix=0x0) at builtin/diff.c:502 > #54872 0x00005555557367bf in run_builtin (p=0x5555561c4c30 <commands+816>, argc=2, argv=0x7fffffff4970) at git.c:469 > #54873 0x000055555573716b in handle_builtin (argc=2, argv=0x7fffffff4970) at git.c:723 > #54874 0x000055555573785a in run_argv (argcp=0x7ffff56028b0, argv=0x7ffff56028e0) at git.c:787 > #54875 0x0000555555738626 in cmd_main (argc=2, argv=0x7fffffff4970) at git.c:922 > #54876 0x00005555559d3fdd in main (argc=3, argv=0x7fffffff4968) at common-main.c:62 > > Fortunately, we can impose a limit on the maximum recursion depth we're > willing to accept when resolving queries like the above without > significantly impeding users. This patch sets the limit at 4096, though > we could probably increase that limit depending on the size of each > frame. > > The limit introduced here is large enough that any reasonable query > should still run to completion, but small enough that if the frame size > were to significantly increase, our protection would still be effective. > > The change here is straightforward: each call to get_nth_ancestor() > increases a counter, and then decrements that counter before returning. > > The diff is a little noisy since there are a handful of return paths > from `get_nth_ancestor()`, all of which need to decrement the depth > variable. > > Since this is a local-only exploit, a user would have to be tricked into > running such a query by an adversary. Even if they were successfully > tricked into running the malicious query, the blast radius is limited to > a local stack overflow, which does not have meaningful paths to remote > code execution, arbitrary memory reads, or any more grave security > concerns. > > Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@xxxxxxxxx> > Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx> > --- > object-name.c | 26 ++++++++++++++++++++------ > t/t1506-rev-parse-diagnosis.sh | 5 +++++ > 2 files changed, 25 insertions(+), 6 deletions(-) > > diff --git a/object-name.c b/object-name.c > index 0bfa29dbbf..675e0a759e 100644 > --- a/object-name.c > +++ b/object-name.c > @@ -1080,6 +1080,9 @@ static enum get_oid_result get_parent(struct repository *r, > return MISSING_OBJECT; > } > > +static int get_nth_ancestor_max_depth = 4096; > +static int get_nth_ancestor_curr_depth; > + > static enum get_oid_result get_nth_ancestor(struct repository *r, > const char *name, int len, > struct object_id *result, > @@ -1089,20 +1092,31 @@ static enum get_oid_result get_nth_ancestor(struct repository *r, > struct commit *commit; > int ret; > > + if (++get_nth_ancestor_curr_depth > get_nth_ancestor_max_depth) > + return error(_("exceeded maximum ancestor depth")); > + > ret = get_oid_1(r, name, len, &oid, GET_OID_COMMITTISH); > if (ret) > - return ret; > + goto done; > commit = lookup_commit_reference(r, &oid); > - if (!commit) > - return MISSING_OBJECT; > + if (!commit) { > + ret = MISSING_OBJECT; > + goto done; > + } > > while (generation--) { > - if (repo_parse_commit(r, commit) || !commit->parents) > - return MISSING_OBJECT; > + if (repo_parse_commit(r, commit) || !commit->parents) { > + ret = MISSING_OBJECT; > + goto done; > + } > commit = commit->parents->item; > } > oidcpy(result, &commit->object.oid); > - return FOUND; > + > + ret = FOUND; > +done: > + get_nth_ancestor_curr_depth--; > + return ret; > } > > struct object *repo_peel_to_type(struct repository *r, const char *name, int namelen, > diff --git a/t/t1506-rev-parse-diagnosis.sh b/t/t1506-rev-parse-diagnosis.sh > index ef40511d89..b3b9f6c8c5 100755 > --- a/t/t1506-rev-parse-diagnosis.sh > +++ b/t/t1506-rev-parse-diagnosis.sh > @@ -244,6 +244,11 @@ test_expect_success 'reject Nth ancestor if N is too high' ' > test_must_fail git rev-parse HEAD~100000000000000000000000000000000 > ' > > +test_expect_success 'reject too-deep recursive ancestor queries' ' > + test_must_fail git rev-parse "HEAD$(perl -e "print \"~\" x 4097")" 2>err && > + grep "error: exceeded maximum ancestor depth" err > +' > + > test_expect_success 'pathspecs with wildcards are not ambiguous' ' > echo "*.c" >expect && > git rev-parse "*.c" >actual && > > base-commit: cfb8a6e9a93adbe81efca66e6110c9b4d2e57169 > -- > 2.43.0.rc2.19.geadd45bf00
Attachment:
signature.asc
Description: PGP signature