Hello!

> commit a684d520ed62cf0db4495e5197d5bf722e4f8109
> Author: Peter Hornyack <peterhornyack@xxxxxxxxxx>
> Date: Fri Dec 18 14:44:04 2015 -0800
>
> KVM: add capabilities and exit reasons for MSRs.
>
> Define KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE, and
> KVM_EXIT_MSR_AFTER_WRITE, new exit reasons for accesses to MSRs that kvm
> does not handle or that user space needs to be notified about. Define the
> KVM_CAP_MSR_EXITS, KVM_CAP_ENABLE_MSR_EXITS, and KVM_CAP_DISABLE_MSR_EXITS
> capabilities to control these new exits for a VM.
>
> diff --git a/Documentation/virtual/kvm/api.txt
> b/Documentation/virtual/kvm/api.txt
> index 053f613fc9a9..3bba3248df3d 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -3359,6 +3359,43 @@ Hyper-V SynIC state change. Notification is
> used to remap SynIC
> event/message pages and to enable/disable SynIC messages/events processing
> in userspace.
>
> +		/*
> +		 * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
> +		 * KVM_EXIT_MSR_AFTER_WRITE
> +		 */
> +		struct {
> +			__u32 index; /* i.e. ecx; out */
> +			__u64 data; /* out (wrmsr) / in (rdmsr) */
> +#define KVM_EXIT_MSR_COMPLETION_FAILED 1
> +			__u64 type; /* out */
> +#define KVM_EXIT_MSR_UNHANDLED 0
> +#define KVM_EXIT_MSR_HANDLED 1
> +			__u8 handled; /* in */
> +		} msr;
> +
> +If exit_reason is KVM_EXIT_MSR_READ or KVM_EXIT_MSR_WRITE, then the vcpu has
> +executed a rdmsr or wrmsr instruction which could not be satisfied by kvm. The
> +msr struct is used for both output to and input from user space. index is the
> +target MSR number held in ecx; user space must not modify this field. data

 In 'index', you meant?
 I would enlarge it to __u64 and use a generalized encoding, the same as for
the KVM_SET_ONE_REG ioctl. I already wrote about it.
 And I would simply use "REG" instead of "MSR" in the naming, because different
architectures call these registers by different names (e.g. on ARM32 they are
called "coprocessor registers" and on ARM64 they are "system registers").
What they have in common is that each is some special CPU register, access to
which can be trapped and emulated. Therefore KVM_EXIT_REG_xxx.

> +holds the payload from a wrmsr or must be filled in with a payload on a rdmsr.
> +For a normal exit, type will be 0.
> +
> +On the return path into kvm, user space should set handled to
> +KVM_EXIT_MSR_HANDLED if it successfully handled the MSR access. Otherwise,
> +handled should be set to KVM_EXIT_MSR_UNHANDLED, which will cause a general
> +protection fault to be injected into the vcpu. If an error occurs during the
> +return into kvm, the vcpu will not be run and another exit will be generated
> +with type set to KVM_EXIT_MSR_COMPLETION_FAILED.
> +
> +If exit_reason is KVM_EXIT_MSR_AFTER_WRITE, then the vcpu has executed a wrmsr
> +instruction which is handled by kvm but which user space may need to be
> +notified about. index and data are set as described above; the value of type
> +depends on the MSR that was written. handled is ignored on reentry into kvm.

 1. Is there any real need to distinguish between KVM_EXIT_MSR_WRITE and
KVM_EXIT_MSR_AFTER_WRITE? IMHO, from userland's point of view these are the
same.
 2. Why do WRITE and READ have to be different exit codes? We could use
something like "u8 is_write" in our structure; this would be more in line with
PIO and MMIO handling.

> +
> +KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE, and KVM_EXIT_MSR_AFTER_WRITE can only
> +occur when KVM_CAP_MSR_EXITS is present and KVM_CAP_ENABLE_MSR_EXITS has been
> +set. A detailed description of these capabilities is below.
> +
>  		/* Fix the size of the union. */
>  		char padding[256];
>  	};
> @@ -3697,6 +3734,26 @@ a KVM_EXIT_IOAPIC_EOI vmexit will be reported
> to userspace.
> Fails if VCPU has already been created, or if the irqchip is already in the
> kernel (i.e. KVM_CREATE_IRQCHIP has already been called).
>
> +7.6 KVM_CAP_ENABLE_MSR_EXITS, KVM_CAP_DISABLE_MSR_EXITS
> +
> +Architectures: x86 (vmx-only)
> +Parameters: none
> +Returns: 0 on success, -1 on error
> +
> +These capabilities enable and disable exits to user space for certain guest MSR
> +accesses. These capabilities are only available if KVM_CHECK_EXTENSION
> +indicates that KVM_CAP_MSR_EXITS is present.
> +
> +When enabled, kvm will exit to user space when the guest reads
> +an MSR that kvm does not handle (KVM_EXIT_MSR_READ), writes an MSR that kvm
> +does not handle (KVM_EXIT_MSR_WRITE), or writes an MSR that kvm handles but for
> +which user space should be notified (KVM_EXIT_MSR_AFTER_WRITE).
> +
> +These exits are currently only implemented for vmx. Also, note that if the kvm
> +module's ignore_msrs flag is set then KVM_EXIT_MSR_READ and KVM_EXIT_MSR_WRITE
> +will not be generated, and unhandled MSR accesses will simply be ignored and
> +the guest re-entered immediately.
> +
>
> 8. Other capabilities.
> ----------------------
> @@ -3726,3 +3783,11 @@ In order to use SynIC, it has to be activated
> by setting this
> capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
> will disable the use of APIC hardware virtualization even if supported
> by the CPU, as it's incompatible with SynIC auto-EOI behavior.
> +
> +8.3 KVM_CAP_MSR_EXITS
> +
> +Architectures: x86 (vmx-only)
> +
> +This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
> +that the kernel implements the KVM_CAP_ENABLE_MSR_EXITS and
> +KVM_CAP_DISABLE_MSR_EXITS capabilities for VMs.
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 6e32f7599081..431fd1ec0d06 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -199,6 +199,9 @@ struct kvm_hyperv_exit {
>  #define KVM_EXIT_S390_STSI 25
>  #define KVM_EXIT_IOAPIC_EOI 26
>  #define KVM_EXIT_HYPERV 27
> +#define KVM_EXIT_MSR_READ 28
> +#define KVM_EXIT_MSR_WRITE 29
> +#define KVM_EXIT_MSR_AFTER_WRITE 30
>
>  /* For KVM_EXIT_INTERNAL_ERROR */
>  /* Emulate instruction failed. */
> @@ -355,6 +358,18 @@ struct kvm_run {
>  		} eoi;
>  		/* KVM_EXIT_HYPERV */
>  		struct kvm_hyperv_exit hyperv;
> +		/*
> +		 * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
> +		 * KVM_EXIT_MSR_AFTER_WRITE
> +		 */
> +		struct {
> +			__u32 index; /* i.e. ecx; out */
> +			__u64 data; /* out (wrmsr) / in (rdmsr) */
> +			__u64 type; /* out */
> +#define KVM_EXIT_MSR_UNHANDLED 0
> +#define KVM_EXIT_MSR_HANDLED 1
> +			__u8 handled; /* in */
> +		} msr;
>  		/* Fix the size of the union. */
>  		char padding[256];
>  	};
> @@ -849,6 +864,9 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_SPLIT_IRQCHIP 121
>  #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122
>  #define KVM_CAP_HYPERV_SYNIC 123
> +#define KVM_CAP_MSR_EXITS 124
> +#define KVM_CAP_DISABLE_MSR_EXITS 125
> +#define KVM_CAP_ENABLE_MSR_EXITS 126
>
>  #ifdef KVM_CAP_IRQ_ROUTING

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
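
For illustration only, the review suggestions above (a 64-bit generalized
register id, "REG" naming, and a single exit with an is_write flag instead of
separate READ/WRITE exit codes) could be sketched roughly as below. All names
here (struct reg_exit, REG_HANDLED, REG_UNHANDLED, EXAMPLE_REG_ID,
handle_reg_exit) are hypothetical and exist neither in the patch nor in the
KVM API; plain C types are used so that the sketch builds outside the kernel
tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, not taken from the patch: one struct for both directions. */
struct reg_exit {
	uint64_t id;       /* generalized register id, in the spirit of KVM_SET_ONE_REG */
	uint64_t data;     /* out on a write; filled in by user space on a read */
	uint8_t  is_write; /* 1 = guest wrote the register, 0 = guest read it */
	uint8_t  handled;  /* set by user space before re-entering the guest */
};

#define REG_UNHANDLED 0
#define REG_HANDLED   1

/* Hypothetical register id, chosen only for this example. */
#define EXAMPLE_REG_ID 0x1234ULL

/* Backing storage for the one register this example emulates. */
static uint64_t example_reg_value;

/* Single user-space handler for reads and writes, similar to PIO/MMIO exits. */
void handle_reg_exit(struct reg_exit *e)
{
	if (e->id != EXAMPLE_REG_ID) {
		/* Unknown register: report it so the kernel side can inject a fault. */
		e->handled = REG_UNHANDLED;
		return;
	}

	if (e->is_write)
		example_reg_value = e->data;	/* guest -> emulated register */
	else
		e->data = example_reg_value;	/* emulated register -> guest */

	e->handled = REG_HANDLED;
}
```

With such a layout, user space needs only one case in its exit-reason switch
and branches on is_write inside it, which is the same shape as the existing
MMIO and PIO exit handling.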