On Fri, Jul 28, 2023, Michal Luczaj wrote: > Attempt to modify vcpu->run->s.regs _after_ the sanity checks performed by > KVM_CAP_SYNC_REGS's arch/x86/kvm/x86.c:sync_regs(). This could lead to some > nonsensical vCPU states accompanied by kernel splats. > > Signed-off-by: Michal Luczaj <mhal@xxxxxxx> > --- > .../selftests/kvm/x86_64/sync_regs_test.c | 124 ++++++++++++++++++ > 1 file changed, 124 insertions(+) > > diff --git a/tools/testing/selftests/kvm/x86_64/sync_regs_test.c b/tools/testing/selftests/kvm/x86_64/sync_regs_test.c > index 2da89fdc2471..feebc7d44c17 100644 > --- a/tools/testing/selftests/kvm/x86_64/sync_regs_test.c > +++ b/tools/testing/selftests/kvm/x86_64/sync_regs_test.c > @@ -15,12 +15,14 @@ > #include <stdlib.h> > #include <string.h> > #include <sys/ioctl.h> > +#include <pthread.h> > > #include "test_util.h" > #include "kvm_util.h" > #include "processor.h" > > #define UCALL_PIO_PORT ((uint16_t)0x1000) > +#define TIMEOUT 2 /* seconds, roughly */ I think it makes sense to make this a const in race_sync_regs(), that way its usage is a bit more obvious. > struct ucall uc_none = { > .cmd = UCALL_NONE, > @@ -80,6 +82,124 @@ static void compare_vcpu_events(struct kvm_vcpu_events *left, > #define TEST_SYNC_FIELDS (KVM_SYNC_X86_REGS|KVM_SYNC_X86_SREGS|KVM_SYNC_X86_EVENTS) > #define INVALID_SYNC_FIELD 0x80000000 > > +/* > + * WARNING: CPU: 0 PID: 1115 at arch/x86/kvm/x86.c:10095 kvm_check_and_inject_events+0x220/0x500 [kvm] > + * > + * arch/x86/kvm/x86.c:kvm_check_and_inject_events(): > + * WARN_ON_ONCE(vcpu->arch.exception.injected && > + * vcpu->arch.exception.pending); > + */ For comments in selftests, describe what's happening without referencing KVM code, things like this in particular will become stale sooner than later. It's a-ok (and encouraged) to put the WARNs and function references in changelogs though, as those are explicitly tied to a specific time in history. > +static void race_sync_regs(void *racer, bool poke_mmu) > +{ > + struct kvm_translation tr; > + struct kvm_vcpu *vcpu; > + struct kvm_run *run; > + struct kvm_vm *vm; > + pthread_t thread; > + time_t t; > + > + vm = vm_create_with_one_vcpu(&vcpu, guest_code); > + run = vcpu->run; > + > + run->kvm_valid_regs = KVM_SYNC_X86_SREGS; > + vcpu_run(vcpu); > + TEST_REQUIRE(run->s.regs.sregs.cr4 & X86_CR4_PAE); This can be an assert, and should also check EFER.LME. Jump-starting in long mode is a property of selftests, i.e. not something that should ever randomly "fail". > + run->kvm_valid_regs = 0; > + > + ASSERT_EQ(pthread_create(&thread, NULL, racer, (void *)run), 0); > + > + for (t = time(NULL) + TIMEOUT; time(NULL) < t;) { > + __vcpu_run(vcpu); > + > + if (poke_mmu) { Rather than pass a boolean, I think it makes sense to do if (racer == race_sregs_cr4) It's arguably just trading ugliness for subtlety, but IMO it's worth avoiding the boolean. > + tr = (struct kvm_translation) { .linear_address = 0 }; > + __vcpu_ioctl(vcpu, KVM_TRANSLATE, &tr); > + } > + } > + > + ASSERT_EQ(pthread_cancel(thread), 0); > + ASSERT_EQ(pthread_join(thread, NULL), 0); > + > + /* > + * If kvm->bugged then we won't survive TEST_ASSERT(). Leak. > + * > + * kvm_vm_free() > + * __vm_mem_region_delete() > + * vm_ioctl(vm, KVM_SET_USER_MEMORY_REGION, ®ion->region) > + * _vm_ioctl(vm, cmd, #cmd, arg) > + * TEST_ASSERT(!ret, __KVM_IOCTL_ERROR(name, ret)) > + */ We want the assert, it makes failures explicit. The signature is a bit unfortunate, but the WARN in the kernel log should provide a big clue. > + if (!poke_mmu) > + kvm_vm_free(vm); > +} > + > int main(int argc, char *argv[]) > { > struct kvm_vcpu *vcpu; > @@ -218,5 +338,9 @@ int main(int argc, char *argv[]) > > kvm_vm_free(vm); > > + race_sync_regs(race_sregs_cr4, true); > + race_sync_regs(race_events_exc, false); > + race_sync_regs(race_events_inj_pen, false); I'll fix up all of the above when applying, and will also split this into three patches, mostly so that each splat can be covered in a changelog, i.e. is tied to its testcase.