Re: [PATCH v2] builtin/blame: fix out-of-bounds read with excessive `--abbrev`

shejialuo <shejialuo@xxxxxxxxx> · Thu, 9 Jan 2025 20:40:23 +0800

On Thu, Jan 09, 2025 at 12:48:22PM +0100, Patrick Steinhardt wrote:
> In 6411a0a896 (builtin/blame: fix type of `length` variable when
> emitting object ID, 2024-12-06) we have fixed the type of the `length`
> variable. In order to avoid a cast from `size_t` to `int` in the call to
> printf(3p) with the "%.*s" formatter we have converted the code to
> instead use fwrite(3p), which accepts the length as a `size_t`.
> 
> It was reported though that this makes us read over the end of the OID
> array when the provided `--abbrev=` length exceeds the length of the
> object ID. This is because fwrite(3p) of course doesn't stop when it
> sees a NUL byte, whereas printf(3p) does.
> 
> Fix the bug by reverting back to printf(3p) and culling the provided
> length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
> `int`.
> 
> Reported-by: Johannes Schindelin <Johannes.Schindelin@xxxxxx>
> Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> ---
> This fixes the issue reported in [1]. Thanks!
> 
> Changes in v2:
> 
>   - Take into account that we may strip ^, * and ? indicators by moving
>     around the check.
>   - Fix the testcase so that it actually fails without the fix.
>   - Link to v1: https://lore.kernel.org/r/20250109-b4-pks-blame-truncate-hash-length-v1-1-9ad4bb09e059@xxxxxx
> 
> Patrick
> 
> [1]: <4d812802-afbc-4635-7a19-73896fcda625@xxxxxx>
> ---
>  builtin/blame.c  | 5 ++++-
>  t/t8002-blame.sh | 4 ++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 867032e4c16878ffd56df8a73162b89ca4bd2694..f92e487bed22eec576a4716f2e654cb61efb9903 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -505,7 +505,10 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
>  			length--;
>  			putchar('?');
>  		}
> -		fwrite(hex, 1, length, stdout);
> +
> +		if (length > GIT_MAX_HEXSZ)
> +			length = GIT_MAX_HEXSZ;

Here, we explicitly set the length if it exceeds the max hex of the
repo. Although `printf` will automatically hide the length, we should
explicitly do this. Make sense.

> +		printf("%.*s", (int)length, hex);
>  		if (opt & OUTPUT_ANNOTATE_COMPAT) {
>  			const char *name;
>  			if (opt & OUTPUT_SHOW_EMAIL)
> diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..7cf6e0253a5bbd4d6e438e627dc18b47eac4df66 100755
> --- a/t/t8002-blame.sh
> +++ b/t/t8002-blame.sh
> @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
>  	check_abbrev $hexsz --no-abbrev
>  '
>  
> +test_expect_success 'blame --abbrev gets truncated' '
> +	check_abbrev $hexsz --abbrev=9000 HEAD
> +'
> +

By the way, I feel this usage is a little strange as the user side. When
I received the report mail from Johannes today morning, I feel a little
funny that we allow the value of the `--abrrev` option exceeds the
`GIT_MAX_HEXSZ` in the first place.

>  test_expect_success '--exclude-promisor-objects does not BUG-crash' '
>  	test_must_fail git blame --exclude-promisor-objects one
>  '
> 
> ---
> base-commit: 14650065b76b28d3cfa9453356ac5669b19e706e
> change-id: 20250109-b4-pks-blame-truncate-hash-length-c875cac66d71
> 

Thanks,
Jialuo