v1: http://lore.kernel.org/r/20200113221053.22053-1-oupton@xxxxxxxxxx v2: http://lore.kernel.org/r/20200128092715.69429-1-oupton@xxxxxxxxxx v1 => v2: - Don't split the #DB delivery by vendors. Unconditionally injecting #DB payloads into the 'pending debug exceptions' field will cause KVM to get stuck in a loop. Per the SDM, when hardware injects an event resulting from this field's value, it is checked against the exception interception bitmap. - Address Sean's comments by injecting the VM-exit into L1 from vmx_check_nested_events(). - Added fix for nested INIT VM-exits + 'pending debug exceptions' field as it was noticed in implementing v2. - Drop Peter + Jim's Reviewed-by tags, as the patch set has changed since v1. v2 => v3: - Merge the check/set_pending_dbg helpers into a single helper, vmx_update_pending_dbg(). Add clarifying comment to this helper. - Rewrite commit message, descriptive comment for change in 3/5 to explicitly describe the reason for mutating payload delivery behavior. - Undo the changes to kvm_vcpu_do_singlestep(). Instead, add a new hook to call for 'full' instruction emulation + 'fast' emulation. KVM already provides guests the ability to use the 'monitor trap flag' VM-execution control. Support for this flag is provided by the fact that KVM unconditionally forwards MTF VM-exits to the guest (if requested), as KVM doesn't utilize MTF. While this provides support during hardware instruction execution, it is insufficient for instruction emulation. Should L0 emulate an instruction on the behalf of L2, L0 should also synthesize an MTF VM-exit into L1, should control be set. The first patch corrects a nuanced difference between the definition of a #DB exception payload field and DR6 register. Mask off bit 12 which is defined in the 'pending debug exceptions' field when applying to DR6, since the payload field is said to be compatible with the aforementioned VMCS field. The second patch sets the 'pending debug exceptions' VMCS field when delivering an INIT signal VM-exit to L1, as described in the SDM. This patch also introduces helpers for setting the 'pending debug exceptions' VMCS field. The third patch massages KVM's handling of exception payloads with regard to API compatibility. Rather than immediately delivering the payload w/o opt-in, instead defer the payload + inject before completing a KVM_GET_VCPU_EVENTS. This maintains API compatibility whilst correcting #DB behavior with regard to higher priority VM-exit events. Fourth patch introduces MTF implementation for emulated instructions. Identify if an MTF is due on an instruction boundary from kvm_vcpu_do_singlestep(), however only deliver this VM-exit from vmx_check_nested_events() to respect the relative prioritization to other VM-exits. Since this augments the nested state, introduce a new flag for (de)serialization. Last patch adds tests to kvm-unit-tests to assert the correctness of MTF under several conditions (concurrent #DB trap, #DB fault, etc). These tests pass under virtualization with this series as well as on bare-metal. Based on commit 2c2787938512 ("KVM: selftests: Stop memslot creation in KVM internal memslot region"). Oliver Upton (4): KVM: x86: Mask off reserved bit from #DB exception payload KVM: nVMX: Handle pending #DB when injecting INIT VM-exit KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS KVM: nVMX: Emulate MTF when performing instruction emulation arch/x86/include/asm/kvm_host.h | 1 + arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/svm.c | 1 + arch/x86/kvm/vmx/nested.c | 54 ++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/nested.h | 5 +++ arch/x86/kvm/vmx/vmx.c | 37 +++++++++++++++++++++- arch/x86/kvm/vmx/vmx.h | 3 ++ arch/x86/kvm/x86.c | 39 ++++++++++++++++-------- 8 files changed, 126 insertions(+), 15 deletions(-) -- 2.25.0.341.g760bfbb309-goog