Re: [PATCH] unit-tests: convert t/helper/test-oid-array.c to unit-tests

"Ghanshyam Thakkar" <shyamthakkar001@xxxxxxxxx> · Tue, 27 Feb 2024 00:41:24 +0530

On Mon Feb 26, 2024 at 8:41 PM IST, Christian Couder wrote:
> It might not be a good idea to start working on a GSoC project we
> propose (Move existing tests to a unit testing framework) right now.
> You can work on it as part of your GSoC application, to show an
> example of how you would do it, and we might review that as part of
> reviewing your application. But for such a project if many candidates
> started working on it and sent patches to the mailing list before they
> get selected, then the project might be nearly finished before the
> GSoC even starts.
>
> So I think it would be better to work on other things instead, like
> perhaps reviewing other people's work or working on other bug fixes or
> features. Anyway now that this is on the mailing list, I might as well
> review it as it could help with your application. But please consider
> working on other things.

I understand and will work on other things.

> > There is only one change in the new testing approach. In the previous
> > testing method, a new repo gets initialized for the test according to
> > GIT_TEST_DEFAULT_HASH algorithm.
>
> It looks like this happens in "t/test-lib-functions.sh", right?
> Telling a bit more about how and where that happens might help
> reviewers who would like to take a look.

Yeah, that is happens in "t/test-lib-functions.sh". I will update the
commit message to describe this better.
>
> > In unit testing however, we do not
> > need to initialize the repo. We can set the length of the hexadecimal
> > strbuf according to the algorithm used directly.
>
> So is your patch doing that or not? It might be better to be explicit.
> Also if 'strbuf's are used, then is it really worth it to set their
> length in advance, instead of just letting them grow to the right
> length as we add hex to them?

I thought of it like this: If we were to just let them grow, then we
would need separate logic for reusing that strbuf or use a different
one everytime since it always grows. By separating allocation
(hex_strbuf_init) and manipulation (fill_hex_strbuf), that same strbuf
can be reused for different hex values.

But, none of the test currently need to reuse the same strbuf, so I
suppose it is better to just let it grow and even if the need arises we
can use strbuf_splice().

> > Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@xxxxxxxxx>
> > ---
> > [RFC]: I recently saw a series by Eric W. Biederman [1] which enables
> > the use of oid's with different hash algorithms into the same
> > oid_array safely. However, there were no tests added for this. So, I
> > am wondering if we should have a input format which allows us to
> > specify hash algo for each oid with its hex value. i.e. "sha1:55" or
> > "sha256:55", instead of just "55" and relying on GIT_TEST_DEFAULT_HASH
> > for algo. So far, I tried to imitate the existing tests but I suppose
> > this may be useful in the future if that series gets merged.
>
> The fact that there is a series touching the same area might also hint
> that it might not be the right time to work on this.

I understand.

> > diff --git a/t/unit-tests/t-oid-array.c b/t/unit-tests/t-oid-array.c
> > new file mode 100644
> > index 0000000000..b4f43c025d
> > --- /dev/null
> > +++ b/t/unit-tests/t-oid-array.c
> > @@ -0,0 +1,222 @@
> > +#include "test-lib.h"
> > +#include "hex.h"
> > +#include "oid-array.h"
> > +#include "strbuf.h"
> > +
> > +#define INPUT "88", "44", "aa", "55"
> > +#define INPUT_DUP \
> > +       "88", "44", "aa", "55", "88", "44", "aa", "55", "88", "44", "aa", "55"
>
> Can you reuse INPUT in INPUT_DUP?

Yeah, that would be more clearer.

> > +#define INPUT_ONLY_DUP "55", "55"
> > +#define ENUMERATION_RESULT_SORTED "44", "55", "88", "aa"
> > +
> > +/*
> > + * allocates the memory based on the hash algorithm used and sets the length to
> > + * it.
> > + */
> > +static void hex_strbuf_init(struct strbuf *hex)
> > +{
> > +       static int sz = -1;
> > +
> > +       if (sz == -1) {
> > +               char *algo_env = getenv("GIT_TEST_DEFAULT_HASH");
> > +               if (algo_env && !strcmp(algo_env, "sha256"))
> > +                       sz = GIT_SHA256_HEXSZ;
> > +               else
> > +                       sz = GIT_SHA1_HEXSZ;
> > +       }
> > +
> > +       strbuf_init(hex, sz);
> > +       strbuf_setlen(hex, sz);
> > +}
>
> A strbuf can grow when we add stuff to it. We don't need to know its
> size in advance. So I am not sure this function is actually useful.

Yeah, this was mainly for deciding the hash algorithm but that logic can
be moved to fill_hex_strbuf.

> > +/* fills the hex strbuf with alternating characters from 'c' */
> > +static void fill_hex_strbuf(struct strbuf *hex, char *c)
> > +{
> > +       size_t i;
> > +       for (i = 0; i < hex->len; i++)
> > +               hex->buf[i] = (i & 1) ? c[1] : c[0];
>
> There is strbuf_addch() to add a single char to a strbuf, or
> strbuf_add() and strbuf_addstr() to add many chars at once.
Will update it.

Thank you for the feedback and review.