Re: kernel selftest max_guest_memory_test fails when using more than 256 vCPUs

Andrew Jones <ajones@xxxxxxxxxxxxxxxx> · Tue, 12 Mar 2024 12:37:11 +0100

On Mon, Mar 11, 2024 at 05:05:18PM -0700, Sean Christopherson wrote:
> On Mon, Mar 11, 2024, mlevitsk@xxxxxxxxxx wrote:
> > Hi,
> > 
> > Recently I debugged a failure of this selftest and this is what is happening:
> > 
> > For each vCPU this test runs the guest till it does the ucall, then it resets
> > all the vCPU registers to their initial values (including RIP) and runs the guest again.
> > I don't know if this is needed.
> > 
> > What happens however is that ucall code allocates the ucall struct prior to calling the host,
> > and then expects the host to resume the guest, at which point the guest frees the struct.
> > 
> > However since the host manually resets the guest registers, the code that frees the ucall struct
> > is never reached and thus the ucall struct is leaked.
> > 
> > Currently ucall code has a pool of KVM_MAX_VCPUS (512) objects, and each vCPU leaks one
> > struct per run (two in total), thus if the test is run with more than 256 vCPUs, the pool
> > is exhausted and the test fails.
> > 
> > So either we need to:
> >   - add a way to manually free the ucall struct for such tests from the host side.
> 
> Part of me wants to do something along these lines, as every GUEST_DONE() and
> failed GUEST_ASSERT() is "leaking" a ucall structure.  But practically speaking,
> freeing a ucall structure from anywhere except the vCPU context is bound to cause
> more problems than it solves.

Yes, ideally the host could clobber guest registers, or do whatever it
likes, without having to consider how it impacts the guest's ability
to manage the test. I.e. the guest code should be more the "software
under test" than the "test software", but KVM selftests blur the line
between test code and tested code all the time, so freeing ucall objects
is just one more case of that.
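
To make the leak concrete, here is a minimal standalone sketch of the
pool behaviour described above. It is a toy model, not the selftests
ucall code: toy_pool, toy_alloc(), toy_free() and toy_run() are
hypothetical names, POOL_SIZE only stands in for KVM_MAX_VCPUS, and
RUNS_PER_VCPU assumes each vCPU does exactly two ucalls (run, reset
registers, run again).

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_SIZE	512	/* stands in for KVM_MAX_VCPUS; illustrative only */
#define RUNS_PER_VCPU	2	/* assumption: run, reset registers, run again */

/* Toy stand-in for the ucall pool: one "in use" flag per slot. */
struct toy_pool {
	bool in_use[POOL_SIZE];
};

/* Grab the first free slot, or return -1 if the pool is exhausted. */
static int toy_alloc(struct toy_pool *pool)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!pool->in_use[i]) {
			pool->in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Return a slot to the pool; this is the step the register reset skips. */
static void toy_free(struct toy_pool *pool, int slot)
{
	assert(slot >= 0 && slot < POOL_SIZE && pool->in_use[slot]);
	pool->in_use[slot] = false;
}

/*
 * Model one test run.  If guest_resumes is false, the host resets the
 * registers after every ucall, so the guest-side free is never reached.
 * Returns true if every ucall got a slot.
 */
static bool toy_run(int nr_vcpus, bool guest_resumes)
{
	struct toy_pool pool = { 0 };

	for (int run = 0; run < RUNS_PER_VCPU; run++) {
		for (int vcpu = 0; vcpu < nr_vcpus; vcpu++) {
			int slot = toy_alloc(&pool);

			if (slot < 0)
				return false;
			if (guest_resumes)
				toy_free(&pool, slot);
		}
	}
	return true;
}
```

Under those assumptions toy_run(256, false) still fits in the pool and
toy_run(257, false) runs out of slots, while toy_run(N, true) never
does, which matches the 256 vCPU threshold in the report.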

> 
> >   - remove the manual reset of the vCPU register state from this test and
> >   instead put the guest code in a while(1) {} loop.
> 
> Definitely this one.

I agree.

> IIRC, the only reason I stuffed registers in the test was
> because I was trying to force MMU reloads.  I can't think of any reason why a
> simple infinite loop in the guest wouldn't work.  I'm pretty sure this is all
> that's needed?
> 
> diff --git a/tools/testing/selftests/kvm/max_guest_memory_test.c b/tools/testing/selftests/kvm/max_guest_memory_test.c
> index 6628dc4dda89..5f9950f41313 100644
> --- a/tools/testing/selftests/kvm/max_guest_memory_test.c
> +++ b/tools/testing/selftests/kvm/max_guest_memory_test.c
> @@ -22,10 +22,12 @@ static void guest_code(uint64_t start_gpa, uint64_t end_gpa, uint64_t stride)
>  {
>         uint64_t gpa;
>  
> -       for (gpa = start_gpa; gpa < end_gpa; gpa += stride)
> -               *((volatile uint64_t *)gpa) = gpa;
> +       for (;;) {
> +               for (gpa = start_gpa; gpa < end_gpa; gpa += stride)
> +                       *((volatile uint64_t *)gpa) = gpa;
>  
> -       GUEST_DONE();
> +               GUEST_DONE();

I'd change this to a GUEST_SYNC(0), since the infinite loop otherwise
contradicts the "done-ness".

Thanks,
drew