Re: [PATCH v6 1/3] KVM: selftests: implement random number generation for guest code

Colton Lewis <coltonlewis@xxxxxxxxxx> · Wed, 12 Oct 2022 21:11:26 +0000

Sean Christopherson <seanjc@xxxxxxxxxx> writes:

On Tue, Oct 11, 2022, Colton Lewis wrote:
Sean Christopherson <seanjc@xxxxxxxxxx> writes:
> Regardless of whether or not the details are gory, having to be aware  
of
> those details unnecessarily impedes understanding the code.  The vast,  
vast
> majority of folks that read this code won't care about how PRNGs work.
> Even if the reader is familiar with PRNGs, those details aren't at all
> relevant to understanding what the guest code does.  The reader only  
needs
> to know "oh, this is randomizing the address".  How the randomization  
works
> is completely irrelevant for that level of understanding.

It is relevant if the reader of the guest code cares about reentrancy
and thread-safety (good for such things as reproducing the same stream
of randoms from the same initial conditions), because they will have to
manage some state to make that work. Whether that state is an integer or
an opaque struct requires the same level of knowledge to use the API.

By "readers" I'm not (only) talking about developers actively writing  
code, I'm
I'm talking about people that run this test, encounter a failure, and  
read the code
to try and understand what went wrong.  For those people, knowing that  
the guest is
generating a random number is sufficient information during initial  
triage.  If the
failure ends up being related to the random number, then yes they'll  
likely need to
learn the details, but that's a few steps down the road.

Ok. That makes sense to me.

> > I agree returning the value could make it easier to use as part of
> > expressions, but is it that important?

> Yes, because in pretty much every use case, the random number is going  
to
> be immediately consumed.  Readability and robustness aside, returning  
the
> value cuts the amount of code need to generate and consume a random  
number
> in half.

Ok. This is trivial to change (or implement on top of what is
there). Would you be happy with something like the following?

uint32_t guest_random(uint32_t *seed)
{
	*seed = (uint64_t)*seed * 48271 % ((uint32_t)(1 << 31) - 1);
	return *seed;
}

Honestly, no.  I truly don't understand the pushback on throwing the seed  
into
a struct and giving the helpers more precise names.  E.g. without looking  
at the
prototype, guest_random() tells the reader nothing about the size of the  
random
number returned, whereas <namespace>_random_u32() or  
<namespace>_get_random_u32()
(if we want to more closely follow kernel nomenclature) tells the reader  
exactly
what is returned.

If that's what you care about, I'll wrap the state and put better
information in the names. I think I misunderstood your original comments
to mean you wanted something much more expansive than you really did and
I'm getting tired of redoing this patch.

The code is trivial to write and I can't think of any meaningful  
downside.  Worst
case scenario, we end up with an implementation that is slightly more  
formal than
then we really need.

As a matter of personal taste, I don't like the additional formality
making things look more complicated than they are. The stakes are small
here but that kind of extra boilerplate can add up to make things
confusing.

Thanks for your patience. I never wanted to cause trouble.