On Mon, Aug 26, 2019 at 3:04 PM Liran Alon <liran.alon@xxxxxxxxxx> wrote:
>
>
>
> > On 27 Aug 2019, at 0:17, Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> > Suppose that L0 just finished emulating an L2 instruction with
> > EFLAGS.TF set. So, we've just queued up a #DB trap in
> > vcpu->arch.exception. On this emulated VM-exit from L2 to L1, the
> > guest pending debug exceptions field in the vmcs12 should get the
> > value of vcpu->arch.exception.payload, and the queued #DB should be
> > squashed.
>
> If I understand correctly you are discussing a case where L2 exited to L0 for
> emulating some instruction when L2’s RFLAGS.TF is set. Therefore, after x86
> emulator finished emulating L2 instruction, it queued a #DB exception.

Right. For example, L0 really likes to emulate L2's MOV-to-CR instructions.

For added complication, what if the emulated instruction is in the
shadow of a POPSS that triggered a data breakpoint? Then there will be
existing pending debug exceptions in vmcs02 that need to be merged
with DR6.BS. (This isn't anything new with your change, though.)

> Then before resuming L2 guest, some other vCPU send an INIT signal
> to this vCPU. When L0 will reach vmx_check_nested_events() it will
> see pending INIT signal and exit on EXIT_REASON_INIT_SIGNAL
> but nested_vmx_vmexit() will basically drop pending #DB exception
> in prepare_vmcs12() when it calls kvm_clear_exception_queue()
> because vmcs12_save_pending_event() only saves injected exceptions.
> (As changed by myself a long time ago)
>
> I think you are right this is a bug.
> I also think it could trivially be fixed by just making sure vmx_check_nested_events()
> first evaluates pending exceptions and only then pending apic events.
> However, we also want to make sure to request an “immediate-exit” in case
> eval of pending exception require emulation of an exit from L2 to L1.

Hmmm. Any exception other than a #DB trap should take precedence over
the INIT. However, the INIT takes precedence over a #DB trap. But
maybe you can fudge the ordering, because there is no way for the
guest to tell which came first?

What about a single-step trap on VMXOFF, with the INIT latched? In
that case, the guest could tell that the INIT was latched before the
VMXOFF, so the INIT must take precedence. Do we get that ordering
right?
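
To make the ordering concrete, here is a minimal standalone C sketch of the priority described above: any exception other than a #DB trap first, then a latched INIT, then the #DB trap. It is not KVM code. The names pending_state, pending_event, next_event(), merge_pending_dbg() and SKETCH_DR6_BS are hypothetical and exist only for this illustration. The only hardware detail assumed is that DR6.BS is bit 14.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model for illustration only; none of these names exist in KVM. */
#define SKETCH_DR6_BS (1ULL << 14)   /* DR6.BS (single-step) is bit 14 */

enum pending_event {
    EVENT_NONE,
    EVENT_EXCEPTION,   /* any pending exception other than a #DB trap */
    EVENT_INIT,        /* latched INIT signal */
    EVENT_DB_TRAP,     /* pending #DB trap, e.g. a single-step trap */
};

struct pending_state {
    bool exception_pending;   /* exception other than a #DB trap */
    bool init_latched;
    bool db_trap_pending;
};

/*
 * Pick the next event using the ordering from the discussion:
 * non-#DB-trap exception > INIT > #DB trap.
 */
enum pending_event next_event(const struct pending_state *s)
{
    if (s->exception_pending)
        return EVENT_EXCEPTION;
    if (s->init_latched)
        return EVENT_INIT;
    if (s->db_trap_pending)
        return EVENT_DB_TRAP;
    return EVENT_NONE;
}

/*
 * When the #DB trap loses to a higher-priority event, its payload should be
 * folded into the already-pending debug exceptions instead of being dropped.
 * This sketch models that as a plain bitwise OR (an assumption).
 */
uint64_t merge_pending_dbg(uint64_t existing, uint64_t payload)
{
    return existing | payload;
}
```

With an INIT latched and a single-step trap pending, next_event() returns EVENT_INIT, which corresponds to the VMXOFF case. The #DB trap is not selected, so its payload (SKETCH_DR6_BS here) would have to be merged into the pending debug exceptions rather than discarded.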