I'm happy to do the kvm-unit-tests for (1) and (2).

The subtlety of exception.pending and exception.injected is lost on
me. We do need to handle pending debug exceptions in a MOV-SS shadow,
but I don't think that's what you're talking about. Can you explain?

On Tue, Oct 9, 2018 at 5:33 AM, Liran Alon <liran.alon@xxxxxxxxxx> wrote:
>
>
>> On 8 Oct 2018, at 21:29, Jim Mattson <jmattson@xxxxxxxxxx> wrote:
>>
>> This is a per-VM capability which can be enabled by userspace so that
>> the faulting linear address will be included with the information
>> about a pending #PF in L2, and the "new DR6 bits" will be included
>> with the information about a pending #DB in L2. With this capability
>> enabled, the L1 hypervisor can now intercept #PF before CR2 is
>> modified. Under VMX, the L1 hypervisor can now intercept #DB before
>> DR6 and DR7 are modified.
>>
>> When userspace has enabled KVM_CAP_EXCEPTION_PAYLOAD, it should
>> generally provide an appropriate payload when injecting a #PF or #DB
>> exception via KVM_SET_VCPU_EVENTS. However, to support restoring old
>> checkpoints, this payload is not required.
>>
>> Note that bit 16 of the "new DR6 bits" is set to indicate that a debug
>> exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM
>> region while advanced debugging of RTM transactional regions was
>> enabled. This is the reverse of DR6.RTM, which is cleared in this
>> scenario.
>>
>> Reported-by: Jim Mattson <jmattson@xxxxxxxxxx>
>> Suggested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx>
>> Reviewed-by: Peter Shier <pshier@xxxxxxxxxx>
>> ---
>> Documentation/virtual/kvm/api.txt | 22 ++++++++++++++++++++++
>> arch/x86/kvm/x86.c                |  5 +++++
>> include/uapi/linux/kvm.h          |  1 +
>> 3 files changed, 28 insertions(+)
>>
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index 2df2cca81cf5..bb2b8bc0ffe0 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -4540,6 +4540,28 @@ With this capability, a guest may read the MSR_PLATFORM_INFO MSR. Otherwise,
>>  a #GP would be raised when the guest tries to access. Currently, this
>>  capability does not enable write permissions of this MSR for the guest.
>>
>> +7.15 KVM_CAP_EXCEPTION_PAYLOAD
>> +
>> +Architectures: x86
>> +Parameters: args[0] whether feature should be enabled or not
>> +
>> +With this capability enabled, CR2 will not be modified prior to the
>> +emulated VM-exit when L1 intercepts a #PF exception that occurs in
>> +L2. Similarly, for kvm-intel only, DR6 will not be modified prior to
>> +the emulated VM-exit when L1 intercepts a #DB exception that occurs in
>> +L2. As a result, when KVM_GET_VCPU_EVENTS reports a pending #PF (or
>> +#DB) exception for L2, exception.has_payload will be set and the
>> +faulting address (or the new DR6 bits*) will be reported in the
>> +exception_payload field. Similarly, when userspace injects a #PF (or
>> +#DB) into L2 using KVM_SET_VCPU_EVENTS, it is expected to set
>> +exception.has_payload and to put the faulting address (or the new DR6
>> +bits*) in the exception_payload field.
>> +
>> +There is no change in behavior for exceptions that occur in L1.
>> +
>> +* For the new DR6 bits, note that bit 16 is set iff the #DB exception
>> +  will clear DR6.RTM.
>> +
>>  8. Other capabilities.
>>  ----------------------
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 33e171e6d067..bcfcfa813c90 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2994,6 +2994,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_IMMEDIATE_EXIT:
>>  	case KVM_CAP_GET_MSR_FEATURES:
>>  	case KVM_CAP_MSR_PLATFORM_INFO:
>> +	case KVM_CAP_EXCEPTION_PAYLOAD:
>>  		r = 1;
>>  		break;
>>  	case KVM_CAP_SYNC_REGS:
>> @@ -4443,6 +4444,10 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>  		kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
>>  		r = 0;
>>  		break;
>> +	case KVM_CAP_EXCEPTION_PAYLOAD:
>> +		kvm->arch.exception_payload_enabled = cap->args[0];
>> +		r = 0;
>> +		break;
>>  	default:
>>  		r = -EINVAL;
>>  		break;
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 251be353f950..531da3d1fd55 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -953,6 +953,7 @@ struct kvm_ppc_resize_hpt {
>>  #define KVM_CAP_NESTED_STATE 157
>>  #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
>>  #define KVM_CAP_MSR_PLATFORM_INFO 159
>> +#define KVM_CAP_EXCEPTION_PAYLOAD 160
>>
>>  #ifdef KVM_CAP_IRQ_ROUTING
>>
>> --
>> 2.19.0.605.g01d371f741-goog
>>
>
> Patch itself looks fine:
> Reviewed-by: Liran Alon <liran.alon@xxxxxxxxxx>
>
> A couple of general notes regarding series:
> 1) I saw that kvm-unit-test 414bd9d5ebd7 ("x86: nVMX: Basic test of #DB intercept in L1")
> verifies that indeed intercept on #DB is delivered before DR6 is modified.
> It would be nice to also have a kvm-unit-test that similarly verifies that intercept on #PF is delivered before CR2 is modified.
> 2) Similar kvm-unit-tests should also be written for nSVM.
> 3) I think now we have the needed framework to also fix
> kvm_vcpu_ioctl_x86_get_vcpu_events() and kvm_vcpu_ioctl_x86_set_vcpu_events()
> to pass exception.pending and exception.injected separately.
> Do you think this work should be done at the end of this patch series or a separate one once this is applied?
>
> -Liran
>
>
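
For reference, the bit 16 convention described in the quoted api.txt
text (payload bit 16 set iff the #DB will clear DR6.RTM) can be
sketched in plain C. This is an illustrative model only, not code from
this patch or from KVM: the function name apply_db_payload() and the
DR6_* macro names are hypothetical, and the bit positions are taken
from the architectural DR6 layout. How stale status bits already in
DR6 are treated is deliberately left out of the sketch.

```c
#include <stdint.h>

/* Architectural DR6 bit positions; macro names are illustrative. */
#define DR6_B0_B3     0x0000000fULL   /* breakpoint condition bits B0-B3 */
#define DR6_BD        (1ULL << 13)    /* debug register access detected */
#define DR6_BS        (1ULL << 14)    /* single step */
#define DR6_BT        (1ULL << 15)    /* task switch */
#define DR6_RTM       (1ULL << 16)    /* active-low: 0 means #DB in RTM region */
#define DR6_TRAP_BITS (DR6_B0_B3 | DR6_BD | DR6_BS | DR6_BT)

/*
 * Hypothetical helper: fold a #DB payload ("new DR6 bits") into a DR6
 * value. Bit 16 of the payload has the opposite polarity to DR6.RTM.
 */
static uint64_t apply_db_payload(uint64_t dr6, uint64_t payload)
{
	/* DR6.RTM reads as 1 unless this #DB occurred in an RTM region. */
	dr6 |= DR6_RTM;
	/* Status bits in the payload have the same polarity as in DR6. */
	dr6 |= payload & DR6_TRAP_BITS;
	/* Payload bit 16 set means DR6.RTM must be cleared. */
	dr6 &= ~(payload & DR6_RTM);
	return dr6;
}
```

With this model, a single-step #DB outside an RTM region
(payload == DR6_BS) leaves DR6.RTM set, while the same payload with
bit 16 also set yields a DR6 value with RTM cleared.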