This is a per-VM capability which can be enabled by userspace so that the faulting linear address will be included with the information about a pending #PF in L2, and the "new DR6 bits" will be included with the information about a pending #DB in L2. With this capability enabled, the L1 hypervisor can now intercept #PF before CR2 is modified. Under VMX, the L1 hypervisor can now intercept #DB before DR6 and DR7 are modified. When userspace has enabled KVM_CAP_EXCEPTION_PAYLOAD, it should generally provide an appropriate payload when injecting a #PF or #DB exception via KVM_SET_VCPU_EVENTS. However, to support restoring old checkpoints, this payload is not required. Note that bit 16 of the "new DR6 bits" is set to indicate that a debug exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM region while advanced debugging of RTM transactional regions was enabled. This is the reverse of DR6.RTM, which is cleared in this scenario. Reported-by: Jim Mattson <jmattson@xxxxxxxxxx> Suggested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx> Reviewed-by: Peter Shier <pshier@xxxxxxxxxx> --- Documentation/virtual/kvm/api.txt | 22 ++++++++++++++++++++++ arch/x86/kvm/x86.c | 5 +++++ include/uapi/linux/kvm.h | 1 + 3 files changed, 28 insertions(+) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 2df2cca81cf5..bb2b8bc0ffe0 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -4540,6 +4540,28 @@ With this capability, a guest may read the MSR_PLATFORM_INFO MSR. Otherwise, a #GP would be raised when the guest tries to access. Currently, this capability does not enable write permissions of this MSR for the guest. +7.15 KVM_CAP_EXCEPTION_PAYLOAD + +Architectures: x86 +Parameters: args[0] whether feature should be enabled or not + +With this capability enabled, CR2 will not be modified prior to the +emulated VM-exit when L1 intercepts a #PF exception that occurs in +L2. Similarly, for kvm-intel only, DR6 will not be modified prior to +the emulated VM-exit when L1 intercepts a #DB exception that occurs in +L2. As a result, when KVM_GET_VCPU_EVENTS reports a pending #PF (or +#DB) exception for L2, exception.has_payload will be set and the +faulting address (or the new DR6 bits*) will be reported in the +exception_payload field. Similarly, when userspace injects a #PF (or +#DB) into L2 using KVM_SET_VCPU_EVENTS, it is expected to set +exception.has_payload and to put the faulting address (or the new DR6 +bits*) in the exception_payload field. + +There is no change in behavior for exceptions that occur in L1. + +* For the new DR6 bits, note that bit 16 is set iff the #DB exception + will clear DR6.RTM. + 8. Other capabilities. ---------------------- diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 33e171e6d067..bcfcfa813c90 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2994,6 +2994,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_IMMEDIATE_EXIT: case KVM_CAP_GET_MSR_FEATURES: case KVM_CAP_MSR_PLATFORM_INFO: + case KVM_CAP_EXCEPTION_PAYLOAD: r = 1; break; case KVM_CAP_SYNC_REGS: @@ -4443,6 +4444,10 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm, kvm->arch.guest_can_read_msr_platform_info = cap->args[0]; r = 0; break; + case KVM_CAP_EXCEPTION_PAYLOAD: + kvm->arch.exception_payload_enabled = cap->args[0]; + r = 0; + break; default: r = -EINVAL; break; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 251be353f950..531da3d1fd55 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -953,6 +953,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_NESTED_STATE 157 #define KVM_CAP_ARM_INJECT_SERROR_ESR 158 #define KVM_CAP_MSR_PLATFORM_INFO 159 +#define KVM_CAP_EXCEPTION_PAYLOAD 160 #ifdef KVM_CAP_IRQ_ROUTING -- 2.19.0.605.g01d371f741-goog