Re: [kvm-unit-tests PATCH 16/16] add IPI loss stress test

On Mon, Oct 24, 2022, Maxim Levitsky wrote:
> On Thu, 2022-10-20 at 20:23 +0000, Sean Christopherson wrote:
> > On Thu, Oct 20, 2022, Maxim Levitsky wrote:
> > > +static void wait_for_ipi(volatile u64 *count)
> > > +{
> > > +       u64 old_count = *count;
> > > +       bool use_halt;
> > > +
> > > +       switch (hlt_allowed) {
> > > +       case -1:
> > > +               use_halt = get_random(0,10000) == 0;
> > 
> > Randomly doing "halt" is going to be annoying to debug.  What about tying the
> > this decision to the iteration and then providing a knob to let the user specify
> > the frequency?  It seems unlikely that this test will expose a bug that occurs
> > if and only if the halt path is truly random.
> 
> This is stress test, it is pretty much impossible to debug, it is more like
> pass/fail test.

There's a big difference between "hard to debug because there's a lot going on"
and "hard to debug because failures are intermittent due to use of random numbers
with no way to ensure a deterministic sequence".  I completely understand that this
type of test is going to be really hard to debug, but that's an argument for making
the test as deterministic as possible, i.e. doing what we can to make it slightly
less awful to debug.

> > > +                       asm volatile ("sti;nop;cli");
> > 
> > sti_nop_cli();
> I think you mean sti_nop(); cli();

I was thinking we could add another helper since it's such a common pattern.
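Something like the below, following the existing sti_nop() naming (the exact
clobbers/placement would need to match KUT's current wrappers, so treat this
as a sketch):

```c
/*
 * Sketch of a combined helper: open a one-instruction interrupt window
 * and close it again.  STI blocks interrupts until after the following
 * instruction, so the NOP is the only point where a pending IRQ can be
 * delivered.  "memory" clobber so the compiler doesn't reorder memory
 * accesses across the window.
 */
static inline void sti_nop_cli(void)
{
	asm volatile("sti; nop; cli" ::: "memory");
}
```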

> > > +
> > > +       } while (old_count == *count);
> > 
> > There's no need to loop in the use_halt case.  If KVM spuriously wakes the vCPU
> > from halt, then that's a KVM bug.  Kinda ugly, but it does provide meaningful
> > coverage for the HLT case.
> 
> Nope - KVM does spuriously wake up the CPU, for example when the vCPU thread
> receives a signal and anything else that makes kvm_vcpu_check_block()
> return -EINTR.

That doesn't (and shouldn't) wake the vCPU from the guest's perspective.  If/when
userspace calls KVM_RUN again, the vCPU's state should still be KVM_MP_STATE_HALTED
and thus KVM will invoke vcpu_block() until there is an actual wake event.
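Concretely, the halt path could then be a single HLT instead of a loop,
something like this sketch (use_halt, u64, and assert() as in the posted test;
privileged guest code, so illustrative only):

```c
static void wait_for_ipi(volatile u64 *count)
{
	u64 old_count = *count;

	if (use_halt) {
		/*
		 * One HLT is enough: STI opens the interrupt window and
		 * HLT sits in its shadow, so the only architecturally
		 * valid wakeup is a real event.  Waking without the IPI
		 * having arrived would be a KVM bug, hence the assert.
		 */
		asm volatile("sti; hlt; cli" ::: "memory");
		assert(old_count != *count);
	} else {
		do {
			asm volatile("sti; nop; cli" ::: "memory");
		} while (old_count == *count);
	}
}
```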

This is something that KVM _must_ get correct.

> > > +static void wait_for_ipi_in_l2(volatile u64 *count, struct svm_vcpu *vcpu)
> > > +{
> > > +       u64 old_count = *count;
> > > +       bool irq_on_vmentry = get_random(0,1) == 0;
> > 
> > Same concerns about using random numbers.
> 
> I can also add a parameter to force this to true/false, or better, long term,
> provide a PRNG and just seed it with either RDRAND or a userspace-given number.
> The RDRAND-retrieved value can even be printed so that the test can be replayed.
> 
> You know just like the tools we both worked on at Intel did....
> 
> In fact I'll just do it - just need to pick some open source PRNG code.
> Do you happen to know a good one? Mersenne Twister? 

It probably makes sense to use whatever we end up using for selftests[*] in order
to minimize the code we have to maintain.

[*] https://lore.kernel.org/all/20221019221321.3033920-2-coltonlewis@xxxxxxxxxx

> > > +               // GIF is set by VMRUN
> > > +               SVM_VMRUN(vcpu->vmcb, &vcpu->regs);
> > > +               // GIF is cleared by VMEXIT
> > > +               asm volatile("cli;nop;stgi");
> > 
> > Why re-enable GIF on every exit?
> 
> And why not? KVM does this on each VMRUN.

Because doing work for no discernible reason is confusing.  E.g. if this were a
"real" hypervisor, it should also context switch CR2.

KVM executes STGI because GIF=0 blocks _all_ interrupts, i.e. KVM needs to recognize
NMI, SMI, #MC, etc. ASAP, even if KVM stays in its tight run loop.  For KUT,
there should never be an NMI, SMI, #MC, etc., and so no need to enable GIF.

I suppose you could make the argument that the test should set GIF when running on
bare metal, but that's tenuous at best as SMI is the only event that isn't fatal to
the test.

> > > +
> > > +       printf("test started, waiting to end...\n");
> > > +
> > > +       while (cpus_active() > 1) {
> > > +
> > > +               unsigned long isr_count1, isr_count2;
> > > +
> > > +               isr_count1 = isr_counts[1];
> > > +               delay(5ULL*1000*1000*1000);
> > 
> > Please add a macro or two for nanoseconds/milliseconds/seconds or whatever this
> > expands to.
> 
> That is the problem - the delay is just in TSC frequency units, and on x86,
> knowing the TSC frequency is for some reason next to impossible on AMD

Ah, delay() takes the number of cycles.  Ugh.

We should fix that, e.g. use the CPUID-provided frequency when possible (KVM should
emulate this if it doesn't already), and then #define an arbitrary TSC frequency as
a fallback so that we can write readable code, e.g. 2.4GHz is probably close enough
to work.

> > And why not have multi configs, e.g. to run with and without x2APIC?
> 
> Good idea as well, although I don't know if I want to slow down the kvm unit
> tests run too much.

We should add a way to flag and omit all "slow" tests, e.g. vmx_vmcs_shadow_test
takes an absurd amount of time and is uninteresting for the vast majority of changes.


