Re: [PATCH v2 02/30] oid-array: teach oid-array to handle multiple kinds of oids

Linus Arver <linusa@xxxxxxxxxx> · Tue, 13 Feb 2024 00:16:34 -0800

"Eric W. Biederman" <ebiederm@xxxxxxxxx> writes:

> From: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
>
> While looking at how to handle input of both SHA-1 and SHA-256 oids in
> get_oid_with_context, I realized that the oid_array in
> repo_for_each_abbrev might have more than one kind of oid stored in it
> simultaneously.
>
> Update to oid_array_append to ensure that oids added to an oid array

s/Update to/Update

> always have an algorithm set.
>
> Update void_hashcmp to first verify two oids use the same hash algorithm
> before comparing them to each other.
>
> With that oid-array should be safe to use with different kinds of

s/oid-array/oid_array

> oids simultaneously.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> ---
>  oid-array.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/oid-array.c b/oid-array.c
> index 8e4717746c31..1f36651754ed 100644
> --- a/oid-array.c
> +++ b/oid-array.c
> @@ -6,12 +6,20 @@ void oid_array_append(struct oid_array *array, const struct object_id *oid)
>  {
>  	ALLOC_GROW(array->oid, array->nr + 1, array->alloc);
>  	oidcpy(&array->oid[array->nr++], oid);
> +	if (!oid->algo)
> +		oid_set_algo(&array->oid[array->nr - 1], the_hash_algo);

How come we can't set oid->algo _before_ we call oidcpy()? It seems odd
that we do the copy first and then modify what we just copied after the
fact, instead of making sure that the thing we want to copy is correct
before doing the copy.

But also, if we are going to make the oid object "correct" before
invoking oidcpy(), we might as well do it when the oid is first
created/used (in the caller(s) of this function). I don't demand that
you find/demonstrate where all these places are in this series (maybe
that's a hairy problem to tackle?), but it seems cleaner in principle to
fix the creation of oid objects instead of having to make oid users
clean up their act like this after using them.

>  	array->sorted = 0;
>  }
>  
> -static int void_hashcmp(const void *a, const void *b)
> +static int void_hashcmp(const void *va, const void *vb)
>  {
> -	return oidcmp(a, b);
> +	const struct object_id *a = va, *b = vb;
> +	int ret;
> +	if (a->algo == b->algo)
> +		ret = oidcmp(a, b);

This makes sense (per the commit message description) ...

> +	else
> +		ret = a->algo > b->algo ? 1 : -1;

... but this seems to go against it? I thought you wanted to only ever
compare hashes if they were of the same algo? It would be good to add a
comment explaining why this is OK (we are no longer doing a byte-by-byte
comparison of these oids any more here like we do for oidcmp() above
which boils down to calling memcmp()).

> +	return ret;

Also, in terms of style I think the "early return for errors" style
would be simpler to read. I.e.

    if (a->algo > b->algo)
        return 1;

    if (a->algo < b->algo)
        return -1;

    return oidcmd(a, b);

>  }
>  
>  void oid_array_sort(struct oid_array *array)
> -- 
> 2.41.0