Re: [PATCH 3/9] test-mergesort: add test subcommand

René Scharfe <l.s.r@xxxxxx> · Sun, 3 Oct 2021 12:15:21 +0200

On 02.10.21 at 10:35, Ævar Arnfjörð Bjarmason wrote:
>
> On Fri, Oct 01 2021, Junio C Hamano wrote:
>
>> René Scharfe <l.s.r@xxxxxx> writes:
>>
>>> +static void dist_rand(int *arr, int n, int m)
>>> +{
>>> +	int i;
>>> +	for (i = 0; i < n; i++)
>>> +		arr[i] = rand() % m;
>>> +}
>>> ...
>>> +static void dist_shuffle(int *arr, int n, int m)
>>> +{
>>> +	int i, j, k;
>>> +	for (i = j = 0, k = 1; i < n; i++)
>>> +		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
>>> +}
>>
>> I briefly wondered if we want to seed the rand() in some way to make
>> the tests reproducible, but we'd need to ship our own rand() if we
>> wanted to go that route, which would probably be too much.
>
> Wouldn't calling srand() with some constant value suffice on most
> platforms? I'm aware of it being a NOOP and rand() always being randomly
> seeded on (IIRC) OpenBSD, but that should work on e.g. glibc.

Right, srand() with a constant seed is not reproducible everywhere, so
we'd need to ship our own random number generator.

Repeatable tests are not essential (the original paper didn't mention
seeding), but shouldn't be much trouble to implement and would simplify
comparisons across versions, systems and among testers.
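
To illustrate how little code that might take, here is a minimal sketch
of a seedable generator, assuming a Park-Miller "minimal standard" LCG
would be good enough for test input.  The names test_srand() and
test_rand() are made up for this example and are not part of the patch;
dist_rand() is the function from the patch with rand() swapped out.

```c
#include <stdint.h>

/* Hypothetical helper state; must stay in the range [1, 2^31 - 2]. */
static uint32_t test_rand_state = 1;

/* Seed the generator; a seed that reduces to zero is mapped to 1. */
static void test_srand(uint32_t seed)
{
	seed %= 2147483647;
	test_rand_state = seed ? seed : 1;
}

/* Park-Miller LCG: x = x * 48271 mod (2^31 - 1), same on all platforms. */
static uint32_t test_rand(void)
{
	test_rand_state = (uint64_t)test_rand_state * 48271 % 2147483647;
	return test_rand_state;
}

/* dist_rand() from the patch, using the portable generator instead. */
static void dist_rand(int *arr, int n, int m)
{
	int i;
	for (i = 0; i < n; i++)
		arr[i] = test_rand() % m;
}
```

Calling test_srand() with the same value before each run would then
yield identical input arrays regardless of the platform's rand().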

The only downside I can think of is that it might also make over-fitting
easier, i.e. I might find micro-tweaks that only work for our specific
rand() sequence and then misinterpret them as general improvements.

René