Re: [PATCH] connect: also update offset for features without values

Taylor Blau <me@xxxxxxxxxxxx> · Sat, 18 Sep 2021 11:53:00 -0400

Hi Andrzej,

On Sat, Sep 18, 2021 at 01:14:32PM +0000, Andrzej Hunt via GitGitGadget wrote:
> From: Andrzej Hunt <andrzej@xxxxxxxxx>

Thanks for writing this patch. I have seen a copy of this on the
security list, but the modified version here looks good to me, too. I
left a few notes throughout.

Recapping our discussion on the security list, we decided that this
didn't merit an embargoed release because a misbehaving server can
already cause a client to hang by simply printing half of its ref
advertisement. So this class of issue isn't new, but fixing this
instance of it is good nonetheless.

> parse_feature_value() does not update offset if the feature being
> searched for does not specify a value. A loop that uses
> parse_feature_value() to find a feature which was specified without a
> value therefore might never exit (such loops will typically use
> next_server_feature_value() as opposed to parse_feature_value() itself).
> This usually isn't an issue: there's no point in using
> next_server_feature_value() to search for repeated instances of the same
> capability unless that capability typically specifies a value - but a
> broken server could send a response that omits the value for a feature
> even when we are expecting a value.

It may be worth adding a little detail here. parse_feature_value() takes
an offset, and uses it to seek past the part of feature_list that we've
already seen. But if we get a value-less feature, then offset is never
updated, and we'll keep parsing the same thing over and over in a loop.

(I know that you know all of that, but I think it is worth spelling out
a little more clearly in the patch message).

> Therefore we add an offset update calculation for the no-value case,
> which helps ensure that loops using next_server_feature_value() will
> always terminate.

> next_server_feature_value(), and the offset calculation, were first
> added in 2.28 in:
>   2c6a403d96 (connect: add function to parse multiple v1 capability values, 2020-05-25)

This line wrapping is a little odd, but not a big deal.

>
> Thanks to Peff for authoring the test.
>
> Co-authored-by: Jeff King <peff@xxxxxxxx>
> Signed-off-by: Jeff King <peff@xxxxxxxx>
> Signed-off-by: Andrzej Hunt <andrzej@xxxxxxxxx>
> ---
>     connect: also update offset for features without values
>
>     This is a small patch to avoid an infinite loop which can occur when a
>     broken server forgets to include a value when specifying symref in the
>     capabilities list.
>
>     Thanks to Peff for writing the test.
>
>     Note: I modified the test by adding and object-format=... to the
>     injected server response, because the oid that we're using is the
>     default hash (which will be e.g. sha256 for some CI jobs), but our
>     protocol handler assumes sha1 unless a different hash has been
>     explicitly specified. I'm open to alternative suggestions.
>
>     ATB,
>
>     Andrzej
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1091%2Fahunt%2Fconnectloop-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1091/ahunt/connectloop-v1
> Pull-Request: https://github.com/git/git/pull/1091
>
>  connect.c                      |  2 ++
>  t/t5704-protocol-violations.sh | 13 +++++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/connect.c b/connect.c
> index aff13a270e6..eaf7d6d2618 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -557,6 +557,8 @@ const char *parse_feature_value(const char *feature_list, const char *feature, i
>  			if (!*value || isspace(*value)) {
>  				if (lenp)
>  					*lenp = 0;
> +				if (offset)
> +					*offset = found + len - feature_list;

The critical piece :-). Since found points into feature_list, this is
perfectly safe. It first calculates the offset of the found string
within feature_list, and then adds the length of the feature name.

I would have found this easier to read if it were spelled out as:

    *offset = found - feature_list + len;

which is the same thing but follows the order of how I spelled out this
expression in English. But the way you wrote it matches how
parse_feature_value() sets the offset when there is a value, so I think
it's worth being consistent with that.

> diff --git a/t/t5704-protocol-violations.sh b/t/t5704-protocol-violations.sh
> index 5c941949b98..34538cebf01 100755
> --- a/t/t5704-protocol-violations.sh
> +++ b/t/t5704-protocol-violations.sh
> @@ -32,4 +32,17 @@ test_expect_success 'extra delim packet in v2 fetch args' '
>  	test_i18ngrep "expected flush after fetch arguments" err
>  '
>
> +test_expect_success 'bogus symref in v0 capabilities' '
> +	test_commit foo &&
> +	oid=$(git rev-parse HEAD) &&
> +	{
> +		printf "%s HEAD\0symref object-format=%s\n" "$oid" "$GIT_DEFAULT_HASH" |
> +			test-tool pkt-line pack-raw-stdin &&

I'm actually really happy with this modification to add the non-empty
object-format after the broken "symref" part, since it ensures that your
offset calculation is right (and that we can continue to parse features
with or without values after a value-less one).

> +		printf "0000"
> +	} >input &&
> +	git ls-remote --upload-pack="cat input ;:" . >actual &&
> +	printf "%s\tHEAD\n" "$oid" >expect &&
> +	test_cmp expect actual
> +'

Looks great to me.

Thanks,
Taylor