> On 8 Oct 2018, at 21:29, Jim Mattson <jmattson@xxxxxxxxxx> wrote:
>
> This is a per-VM capability which can be enabled by userspace so that
> the faulting linear address will be included with the information
> about a pending #PF in L2, and the "new DR6 bits" will be included
> with the information about a pending #DB in L2. With this capability
> enabled, the L1 hypervisor can now intercept #PF before CR2 is
> modified. Under VMX, the L1 hypervisor can now intercept #DB before
> DR6 and DR7 are modified.
>
> When userspace has enabled KVM_CAP_EXCEPTION_PAYLOAD, it should
> generally provide an appropriate payload when injecting a #PF or #DB
> exception via KVM_SET_VCPU_EVENTS. However, to support restoring old
> checkpoints, this payload is not required.
>
> Note that bit 16 of the "new DR6 bits" is set to indicate that a debug
> exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM
> region while advanced debugging of RTM transactional regions was
> enabled. This is the reverse of DR6.RTM, which is cleared in this
> scenario.
>
> Reported-by: Jim Mattson <jmattson@xxxxxxxxxx>
> Suggested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx>
> Reviewed-by: Peter Shier <pshier@xxxxxxxxxx>
> ---
>  Documentation/virtual/kvm/api.txt | 22 ++++++++++++++++++++++
>  arch/x86/kvm/x86.c                |  5 +++++
>  include/uapi/linux/kvm.h          |  1 +
>  3 files changed, 28 insertions(+)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 2df2cca81cf5..bb2b8bc0ffe0 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -4540,6 +4540,28 @@ With this capability, a guest may read the MSR_PLATFORM_INFO MSR. Otherwise,
>  a #GP would be raised when the guest tries to access. Currently, this
>  capability does not enable write permissions of this MSR for the guest.
>
> +7.15 KVM_CAP_EXCEPTION_PAYLOAD
> +
> +Architectures: x86
> +Parameters: args[0] whether feature should be enabled or not
> +
> +With this capability enabled, CR2 will not be modified prior to the
> +emulated VM-exit when L1 intercepts a #PF exception that occurs in
> +L2. Similarly, for kvm-intel only, DR6 will not be modified prior to
> +the emulated VM-exit when L1 intercepts a #DB exception that occurs in
> +L2. As a result, when KVM_GET_VCPU_EVENTS reports a pending #PF (or
> +#DB) exception for L2, exception.has_payload will be set and the
> +faulting address (or the new DR6 bits*) will be reported in the
> +exception_payload field. Similarly, when userspace injects a #PF (or
> +#DB) into L2 using KVM_SET_VCPU_EVENTS, it is expected to set
> +exception.has_payload and to put the faulting address (or the new DR6
> +bits*) in the exception_payload field.
> +
> +There is no change in behavior for exceptions that occur in L1.
> +
> +* For the new DR6 bits, note that bit 16 is set iff the #DB exception
> +  will clear DR6.RTM.
> +
>  8. Other capabilities.
>  ----------------------
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 33e171e6d067..bcfcfa813c90 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2994,6 +2994,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_IMMEDIATE_EXIT:
>  	case KVM_CAP_GET_MSR_FEATURES:
>  	case KVM_CAP_MSR_PLATFORM_INFO:
> +	case KVM_CAP_EXCEPTION_PAYLOAD:
>  		r = 1;
>  		break;
>  	case KVM_CAP_SYNC_REGS:
> @@ -4443,6 +4444,10 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
>  		r = 0;
>  		break;
> +	case KVM_CAP_EXCEPTION_PAYLOAD:
> +		kvm->arch.exception_payload_enabled = cap->args[0];
> +		r = 0;
> +		break;
>  	default:
>  		r = -EINVAL;
>  		break;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 251be353f950..531da3d1fd55 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -953,6 +953,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_NESTED_STATE 157
>  #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
>  #define KVM_CAP_MSR_PLATFORM_INFO 159
> +#define KVM_CAP_EXCEPTION_PAYLOAD 160
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> --
> 2.19.0.605.g01d371f741-goog
>

Patch itself looks fine:
Reviewed-by: Liran Alon <liran.alon@xxxxxxxxxx>

A couple of general notes regarding the series:

1) I saw that kvm-unit-tests commit 414bd9d5ebd7 ("x86: nVMX: Basic test of #DB intercept in L1") verifies that the #DB intercept is indeed delivered before DR6 is modified. It would be nice to also have a kvm-unit-test that similarly verifies that the #PF intercept is delivered before CR2 is modified.

2) Similar kvm-unit-tests should also be written for nSVM.

3) I think we now have the framework needed to also fix kvm_vcpu_ioctl_x86_get_vcpu_events() and kvm_vcpu_ioctl_x86_set_vcpu_events() to pass exception.pending and exception.injected separately. Do you think this work should be done at the end of this patch series, or in a separate one once this is applied?

-Liran
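
For reference, here is how I read the bit 16 inversion described in the commit message and the api.txt footnote, as a small standalone sketch. It is not code from this series: apply_db_payload() and the DR6_* macro names are made up for illustration, and the step that drops stale B0-B3 bits is my assumption about how a payload would be folded into DR6. The bit positions themselves (B0-B3, BS at bit 14, RTM at bit 16) are the architectural ones.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural DR6 bit positions (macro names are illustrative). */
#define DR6_B0_B3 UINT64_C(0xf)       /* breakpoint condition bits B0..B3 */
#define DR6_BS    (UINT64_C(1) << 14) /* single-step trap */
#define DR6_RTM   (UINT64_C(1) << 16) /* active low: 0 means #DB inside RTM */

/*
 * Hypothetical helper: fold a #DB exception payload ("new DR6 bits")
 * into an existing DR6 value. Payload bit 16 has the opposite polarity
 * of DR6.RTM, so it is flipped on the way in.
 */
static uint64_t apply_db_payload(uint64_t dr6, uint64_t payload)
{
	dr6 &= ~DR6_B0_B3;        /* assumption: stale B0..B3 hits are dropped */
	dr6 |= DR6_RTM;           /* default state: not inside an RTM region */
	dr6 |= payload;           /* copy the payload bits over */
	dr6 ^= payload & DR6_RTM; /* payload bit 16 set => DR6.RTM cleared */
	return dr6;
}

/* Usage example with the architectural DR6 reset value 0xFFFF0FF0. */
static void check_examples(void)
{
	/* Plain single-step trap: BS is set, DR6.RTM stays set. */
	assert(apply_db_payload(UINT64_C(0xFFFF0FF0), DR6_BS) ==
	       UINT64_C(0xFFFF4FF0));
	/* Breakpoint 0 hit inside an RTM region: B0 set, DR6.RTM cleared. */
	assert(apply_db_payload(UINT64_C(0xFFFF0FF0), UINT64_C(0x10001)) ==
	       UINT64_C(0xFFFE0FF1));
}
```

The XOR in the last step is what makes the payload "the reverse of DR6.RTM": a payload with bit 16 set ends up with bit 16 clear in DR6, and any other payload leaves it set.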