Re: [PATCH] builtin/blame: fix out-of-bounds read with excessive `--abbrev`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Patrick,

On Thu, 9 Jan 2025, Patrick Steinhardt wrote:

> On Thu, Jan 09, 2025 at 11:49:43AM +0100, Johannes Schindelin wrote:
> > > diff --git a/builtin/blame.c b/builtin/blame.c
> > > index 867032e4c16878ffd56df8a73162b89ca4bd2694..ad91fe9e97f90625dd2708fbd44bf2dd24a337a6 100644
> > > --- a/builtin/blame.c
> > > +++ b/builtin/blame.c
> > > @@ -475,6 +475,8 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > >  		char ch;
> > >  		size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> > >  			the_hash_algo->hexsz : (size_t) abbrev;
> > > +		if (length > GIT_MAX_HEXSZ)
> > > +			length = GIT_MAX_HEXSZ;
> >
> > This causes a subtle change of behavior because there are a couple of
> > conditional code blocks between this change and the `printf()` call
> > decrease `length`, i.e. specifying values larger than the maximal hex size
> > causes potentially-desirable, different behavior (and think about
> > https://www.hyrumslaw.com/).
>
> Alternatively we can move this until after we have done the
> subtractions. Then we don't have to do weird gymnastics.

Or we can even avoid assiging a maximum altogether:

		if (length < GIT_MAX_HEXSZ)
			printf("%.*s", (int)length, hex);
		else
			printf("%s", hex);

Or be more consistent with Git's source code style which often prefers
ternaries, favoring succinctness over readability:

		printf("%.*s", (int)(length < GIT_MAX_HEXSZ ? length : GIT_MAX_HEXSZ), hex);

> > >  		if (opt & OUTPUT_COLOR_LINE) {
> > >  			if (cnt > 0) {
> > > @@ -505,7 +507,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> > >  			length--;
> > >  			putchar('?');
> > >  		}
> > > -		fwrite(hex, 1, length, stdout);
> > > +		printf("%.*s", (int)length, hex);
> > >  		if (opt & OUTPUT_ANNOTATE_COMPAT) {
> > >  			const char *name;
> > >  			if (opt & OUTPUT_SHOW_EMAIL)
> > > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> > > index 0147de304b4d104cc7f05ea1f8d68f1a07ceb80d..fcaba8c11f7ede084e069eefd292f337e8396cb4 100755
> > > --- a/t/t8002-blame.sh
> > > +++ b/t/t8002-blame.sh
> > > @@ -126,6 +126,10 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> > >  	check_abbrev $hexsz --no-abbrev
> > >  '
> > >
> > > +test_expect_success 'blame --abbrev gets truncated' '
> > > +	check_abbrev 9000 --abbrev=$hexsz HEAD
> >
> > This is actually incorrect: it passes `--abbrev=$hexsz` instead of a value
> > that needs to be truncated.
>
> Oh dear. The test did manage to catch the bug, but thinking more about
> it that was only because my initial fix was broken.
>
> > diff --git a/builtin/blame.c b/builtin/blame.c
> > index ad91fe9e97f9..5b4976835066 100644
> > --- a/builtin/blame.c
> > +++ b/builtin/blame.c
> > @@ -475,8 +475,13 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
> >  		char ch;
> >  		size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
> >  			the_hash_algo->hexsz : (size_t) abbrev;
> > -		if (length > GIT_MAX_HEXSZ)
> > -			length = GIT_MAX_HEXSZ;
> > +
> > +		/*
> > +		 * Leave enough space for ^, * and ? indicators (boundary,
> > +		 * unblamable, ignored).
> > +		 */
> > +		if (length > GIT_MAX_HEXSZ + 3)
> > +			length = GIT_MAX_HEXSZ + 3;
> >
> >  		if (opt & OUTPUT_COLOR_LINE) {
> >  			if (cnt > 0) {
>
> How about this instead?
>
>     diff --git a/builtin/blame.c b/builtin/blame.c
>     index ad91fe9e97..f92e487bed 100644
>     --- a/builtin/blame.c
>     +++ b/builtin/blame.c
>     @@ -475,8 +475,6 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
>             char ch;
>             size_t length = (opt & OUTPUT_LONG_OBJECT_NAME) ?
>                 the_hash_algo->hexsz : (size_t) abbrev;
>     -		if (length > GIT_MAX_HEXSZ)
>     -			length = GIT_MAX_HEXSZ;
>
>             if (opt & OUTPUT_COLOR_LINE) {
>                 if (cnt > 0) {
>     @@ -507,6 +505,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int
>                 length--;
>                 putchar('?');
>             }
>     +
>     +		if (length > GIT_MAX_HEXSZ)
>     +			length = GIT_MAX_HEXSZ;
>             printf("%.*s", (int)length, hex);
>             if (opt & OUTPUT_ANNOTATE_COMPAT) {
>                 const char *name;
>
> In that case there's no need to juggle with the magic indicators, which
> makes it a bit easier to reason about.
>
> > diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh
> > index fcaba8c11f7e..71fa70a64679 100755
> > --- a/t/t8002-blame.sh
> > +++ b/t/t8002-blame.sh
> > @@ -127,7 +127,7 @@ test_expect_success '--no-abbrev works like --abbrev with full length' '
> >  '
> >
> >  test_expect_success 'blame --abbrev gets truncated' '
> > -	check_abbrev 9000 --abbrev=$hexsz HEAD
> > +	check_abbrev 9000 --abbrev=9000 HEAD..
> >  '
>
> This should be `check_abbrev $hexsz --abbrev=9000`, shouldn't it?

I kind of liked the idea to keep the same cut-off threshold for the
validation. But I won't insist. The construct is obtuse in either case ;-)
(Not your fault, of course, you merely imitated existing code, which is
the correct thing to do).

Ciao,
Johannes





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux