On Mon, Nov 01, 2021, Maxim Levitsky wrote:
> On Mon, 2021-11-01 at 16:43 +0100, Vitaly Kuznetsov wrote:
> > Paolo Bonzini <pbonzini@xxxxxxxxxx> writes:
> >
> > > On 11/08/21 14:29, Maxim Levitsky wrote:
> > > > Modify debug_regs test to create a pending interrupt
> > > > and see that it is blocked when single stepping is done
> > > > with KVM_GUESTDBG_BLOCKIRQ
> > > >
> > > > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > > > ---
> > > >  .../testing/selftests/kvm/x86_64/debug_regs.c | 24 ++++++++++++++++---
> > > >  1 file changed, 21 insertions(+), 3 deletions(-)
> > >
> > > I haven't looked very much at this, but the test fails.
> > >
> >
> > Same here,
> >
> > the test passes on AMD but fails consistently on Intel:
> >
> > # ./x86_64/debug_regs
> > ==== Test Assertion Failure ====
> >   x86_64/debug_regs.c:179: run->exit_reason == KVM_EXIT_DEBUG && run->debug.arch.exception == DB_VECTOR && run->debug.arch.pc == target_rip && run->debug.arch.dr6 == target_dr6
> >   pid=13434 tid=13434 errno=0 - Success
> >      1 0x00000000004027c6: main at debug_regs.c:179
> >      2 0x00007f65344cf554: ?? ??:0
> >      3 0x000000000040294a: _start at ??:?
> >   SINGLE_STEP[1]: exit 8 exception 1 rip 0x402a25 (should be 0x402a27) dr6 0xffff4ff0 (should be 0xffff4ff0)
> >
> > (I know I'm late to the party).
>
> Well that is strange. It passes on my intel laptop. Just tested
> (kvm/queue + qemu master, compiled today) :-(
>
> It fails on iteration 1 (and there is iteration 0) which I think means that we
> start with RIP on sti, and get #DB on start of xor instruction first (correctly),
> and then we get #DB again on start of xor instruction again?
>
> Something very strange. My laptop has i7-7600U.

I haven't verified on hardware, but my guess is that this code in vmx_vcpu_run()

	/* When single-stepping over STI and MOV SS, we must clear the
	 * corresponding interruptibility bits in the guest state. Otherwise
	 * vmentry fails as it then expects bit 14 (BS) in pending debug
	 * exceptions being set, but that's not correct for the guest debugging
	 * case. */
	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
		vmx_set_interrupt_shadow(vcpu, 0);

interacts badly with APICv=1.  It will kill the STI shadow and cause the IRQ in
vmcs.GUEST_RVI to be recognized when it (micro-)architecturally should not.  My
head is going in circles trying to sort out what would actually happen.  Maybe
comment out that and/or disable APICv to see if either one makes the test pass?
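
For the "disable APICv" half of that experiment, it may help to first confirm
whether APICv is actually enabled on the failing and the passing machine.  A
minimal sketch, assuming kvm_intel is loaded and exposes its enable_apicv
module parameter through sysfs; the helper names parse_bool_param() and
read_bool_param() are made up for illustration and are not part of KVM or the
selftests:

```c
#include <stdio.h>

/* sysfs path of kvm_intel's APICv module parameter. */
#define APICV_PARAM "/sys/module/kvm_intel/parameters/enable_apicv"

/*
 * Interpret the text of a boolean module parameter, which the kernel
 * prints as "Y\n" or "N\n".
 * Returns 1 for enabled, 0 for disabled, -1 if unrecognized.
 * (Hypothetical helper, for illustration only.)
 */
static int parse_bool_param(const char *text)
{
	switch (text[0]) {
	case 'Y': case 'y': case '1':
		return 1;
	case 'N': case 'n': case '0':
		return 0;
	default:
		return -1;
	}
}

/*
 * Read a boolean module parameter from the given sysfs path.
 * Returns 1/0 as above, or -1 if the file cannot be read
 * (for example because the module is not loaded).
 * (Hypothetical helper, for illustration only.)
 */
static int read_bool_param(const char *path)
{
	char buf[8] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_bool_param(buf);
}
```

Calling read_bool_param(APICV_PARAM) before running ./x86_64/debug_regs shows
which configuration is being tested.  If it returns 1, reloading the module
with "modprobe kvm_intel enable_apicv=0" and rerunning the test would cover
the APICv-disabled case.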