Re: [PATCH] object-name: reject too-deep recursive ancestor queries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 20, 2023 at 11:13:45AM -0500, Taylor Blau wrote:
> When trying to resolve a revision query like "HEAD~~~~~", our call
> pattern looks something like:
> 
>   - object-name.c::get_oid_with_context()
>   - object-name.c::get_oid_1()
>   - object-name.c::get_nth_ancestor()
>   - object-name.c::get_oid_1()
>   - ...
> 
> With `get_nth_ancestor()` and `get_oid_1()` mutually recurring, popping
> one '~' off of the revision query for each round of the recursion.

One thing I noticed just now is that we have exactly the same problem
with `^`, just with a different callstack. This problem isn't yet
addressed by your patch.

I have to wonder whether we should tighten restrictions even further:
instead of manually keeping track of how deep in the stack we are, we
limit the length of revisions to at most 1MB. I would claim that this
limit is sufficiently large to never be a problem in practice. Revisions
are limited to 4kB on most platforms anyway due to the maximum path
length. And if somebody really wants to request the (1024 * 1024) + 1th
parent, they shouldn't do that by appending this many "~" or "^" chars,
but instead they should ask for "~1048577" or "^1048577".

I realize that this is much more restrictive than the current patch. But
it would be a good defensive mechanism against all kinds of weird revs,
and I am very certain that there are other ways to blow the stack or
cause out-of-bounds reads or writes here.

Patrick

> Since this recursive behavior is unbounded, having too many "~"'s
> contained in a revision query will cause us to blow the stack.
> Generating a message like this when compiled under SANITIZE=address:
> 
>     $ valgrind git rev-parse "HEAD$(perl -e "print \"~\" x 1000000000000")"
>     ==597453== Memcheck, a memory error detector
>     ==597453== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
>     ==597453== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
>     ==597453== Command: /home/ttaylorr/local/bin/git.compile diff HEAD~~~~~~~~~~~~[...]
>     ==597453==
>     AddressSanitizer:DEADLYSIGNAL
>     =================================================================
>     ==597453==ERROR: AddressSanitizer: stack-overflow on address 0x7fffdd838ff8 (pc 0x7f2726082748 bp 0x7fffdd839110 sp 0x7fffdd839000 T0)
>         #0 0x7f2726082748 in __asan::GetTLSFakeStack() ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:176
>         #1 0x7f2726082748 in GetFakeStackFast ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:193
>         #2 0x7f27260833de in OnMalloc ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:207
>         #3 0x7f27260833de in __asan_stack_malloc_1 ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:256
>         #4 0x563f9077d9d8 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1087
>         #5 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295
>         #6 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092
>         #7 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295
>         #8 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092
>         [...]
>         #247 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295
>         #248 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092
> 
>     SUMMARY: AddressSanitizer: stack-overflow ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:176 in __asan::GetTLSFakeStack()
>     ==597453==ABORTING
> 
> (Note that the actual stack is much deeper. GDB reports that the bottom
> of the stack looks something like the following):
> 
>     #54866 0x0000555555c6d3bf in get_oid_with_context_1 (repo=0x5555563849a0 <the_repo>, name=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., flags=128, prefix=0x0, oid=0x7ffff5713d40, oc=0x7ffff5713d90) at object-name.c:1947
>     #54867 0x0000555555c6e2fa in get_oid_with_context (repo=0x5555563849a0 <the_repo>, str=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., flags=128, oid=0x7ffff5713d40, oc=0x7ffff5713d90) at object-name.c:2096
>     #54868 0x0000555555d8eed8 in handle_revision_arg_1 (arg_=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., revs=0x7ffff5b000d0, flags=0, revarg_opt=0) at revision.c:2174
>     #54869 0x0000555555d8f1a9 in handle_revision_arg (arg=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., revs=0x7ffff5b000d0, flags=0, revarg_opt=0) at revision.c:2189
>     #54870 0x0000555555d97ca9 in setup_revisions (argc=2, argv=0x7fffffff4970, revs=0x7ffff5b000d0, opt=0x0) at revision.c:2932
>     #54871 0x00005555557d6a63 in cmd_diff (argc=2, argv=0x7fffffff4970, prefix=0x0) at builtin/diff.c:502
>     #54872 0x00005555557367bf in run_builtin (p=0x5555561c4c30 <commands+816>, argc=2, argv=0x7fffffff4970) at git.c:469
>     #54873 0x000055555573716b in handle_builtin (argc=2, argv=0x7fffffff4970) at git.c:723
>     #54874 0x000055555573785a in run_argv (argcp=0x7ffff56028b0, argv=0x7ffff56028e0) at git.c:787
>     #54875 0x0000555555738626 in cmd_main (argc=2, argv=0x7fffffff4970) at git.c:922
>     #54876 0x00005555559d3fdd in main (argc=3, argv=0x7fffffff4968) at common-main.c:62
> 
> Fortunately, we can impose a limit on the maximum recursion depth we're
> willing to accept when resolving queries like the above without
> significantly impeding users. This patch sets the limit at 4096, though
> we could probably increase that limit depending on the size of each
> frame.
> 
> The limit introduced here is large enough that any reasonable query
> should still run to completion, but small enough that if the frame size
> were to significantly increase, our protection would still be effective.
> 
> The change here is straightforward: each call to get_nth_ancestor()
> increases a counter, and then decrements that counter before returning.
> 
> The diff is a little noisy since there are a handful of return paths
> from `get_nth_ancestor()`, all of which need to decrement the depth
> variable.
> 
> Since this is a local-only exploit, a user would have to be tricked into
> running such a query by an adversary. Even if they were successfully
> tricked into running the malicious query, the blast radius is limited to
> a local stack overflow, which does not have meaningful paths to remote
> code execution, arbitrary memory reads, or any more grave security
> concerns.
> 
> Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@xxxxxxxxx>
> Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
> ---
>  object-name.c                  | 26 ++++++++++++++++++++------
>  t/t1506-rev-parse-diagnosis.sh |  5 +++++
>  2 files changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/object-name.c b/object-name.c
> index 0bfa29dbbf..675e0a759e 100644
> --- a/object-name.c
> +++ b/object-name.c
> @@ -1080,6 +1080,9 @@ static enum get_oid_result get_parent(struct repository *r,
>  	return MISSING_OBJECT;
>  }
>  
> +static int get_nth_ancestor_max_depth = 4096;
> +static int get_nth_ancestor_curr_depth;
> +
>  static enum get_oid_result get_nth_ancestor(struct repository *r,
>  					    const char *name, int len,
>  					    struct object_id *result,
> @@ -1089,20 +1092,31 @@ static enum get_oid_result get_nth_ancestor(struct repository *r,
>  	struct commit *commit;
>  	int ret;
>  
> +	if (++get_nth_ancestor_curr_depth > get_nth_ancestor_max_depth)
> +		 return error(_("exceeded maximum ancestor depth"));
> +
>  	ret = get_oid_1(r, name, len, &oid, GET_OID_COMMITTISH);
>  	if (ret)
> -		return ret;
> +		goto done;
>  	commit = lookup_commit_reference(r, &oid);
> -	if (!commit)
> -		return MISSING_OBJECT;
> +	if (!commit) {
> +		ret = MISSING_OBJECT;
> +		goto done;
> +	}
>  
>  	while (generation--) {
> -		if (repo_parse_commit(r, commit) || !commit->parents)
> -			return MISSING_OBJECT;
> +		if (repo_parse_commit(r, commit) || !commit->parents) {
> +			ret = MISSING_OBJECT;
> +			goto done;
> +		}
>  		commit = commit->parents->item;
>  	}
>  	oidcpy(result, &commit->object.oid);
> -	return FOUND;
> +
> +	ret = FOUND;
> +done:
> +	get_nth_ancestor_curr_depth--;
> +	return ret;
>  }
>  
>  struct object *repo_peel_to_type(struct repository *r, const char *name, int namelen,
> diff --git a/t/t1506-rev-parse-diagnosis.sh b/t/t1506-rev-parse-diagnosis.sh
> index ef40511d89..b3b9f6c8c5 100755
> --- a/t/t1506-rev-parse-diagnosis.sh
> +++ b/t/t1506-rev-parse-diagnosis.sh
> @@ -244,6 +244,11 @@ test_expect_success 'reject Nth ancestor if N is too high' '
>  	test_must_fail git rev-parse HEAD~100000000000000000000000000000000
>  '
>  
> +test_expect_success 'reject too-deep recursive ancestor queries' '
> +	test_must_fail git rev-parse "HEAD$(perl -e "print \"~\" x 4097")" 2>err &&
> +	grep "error: exceeded maximum ancestor depth" err
> +'
> +
>  test_expect_success 'pathspecs with wildcards are not ambiguous' '
>  	echo "*.c" >expect &&
>  	git rev-parse "*.c" >actual &&
> 
> base-commit: cfb8a6e9a93adbe81efca66e6110c9b4d2e57169
> -- 
> 2.43.0.rc2.19.geadd45bf00

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux