Re: [PATCH 1/6] KVM: x86: Add ioctl for accepting a userspace provided MSR list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 7, 2020 at 9:12 AM Alexander Graf <graf@xxxxxxxxxx> wrote:
>
>
>
> On 04.08.20 06:20, Aaron Lewis wrote:
> > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> >
> >
> >
> > Add KVM_SET_EXIT_MSRS ioctl to allow userspace to pass in a list of MSRs
> > that force an exit to userspace when rdmsr or wrmsr are used by the
> > guest.
> >
> > KVM_SET_EXIT_MSRS will need to be called before any vCPUs are
> > created to protect the 'user_exit_msrs' list from being mutated while
> > vCPUs are running.
> >
> > Add KVM_CAP_SET_MSR_EXITS to identify the feature exists.
> >
> > Signed-off-by: Aaron Lewis <aaronlewis@xxxxxxxxxx>
> > Reviewed-by: Oliver Upton <oupton@xxxxxxxxxx>
> > ---
> >   Documentation/virt/kvm/api.rst  | 24 +++++++++++++++++++
> >   arch/x86/include/asm/kvm_host.h |  2 ++
> >   arch/x86/kvm/x86.c              | 41 +++++++++++++++++++++++++++++++++
> >   include/uapi/linux/kvm.h        |  2 ++
> >   4 files changed, 69 insertions(+)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 320788f81a05..7d8167c165aa 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -1006,6 +1006,30 @@ such as migration.
> >   :Parameters: struct kvm_vcpu_event (out)
> >   :Returns: 0 on success, -1 on error
> >
> > +4.32 KVM_SET_EXIT_MSRS
> > +------------------
> > +
> > +:Capability: KVM_CAP_SET_MSR_EXITS
> > +:Architectures: x86
> > +:Type: vm ioctl
> > +:Parameters: struct kvm_msr_list (in)
> > +:Returns: 0 on success, -1 on error
> > +
> > +Sets the userspace MSR list which is used to track which MSRs KVM should send
> > +to userspace to be serviced when the guest executes rdmsr or wrmsr.
>
> Unfortunately this doesn't solve the whole "ignore_msrs" mess that we
> have today. How can I just say "tell user space about unhandled MSRs"?
> And if I add that on top of this mechanism, how do we not make the list
> of MSRs that are handled in-kernel an ABI?

Jumping in for Aaron, who is out this afternoon...

This patch doesn't attempt to solve your problem, "tell user space
about unhandled MSRs." It attempts to solve our problem, "tell user
space about a specific set of MSRs, even if kvm learns to handle them
in the future." This is, in fact, what we really wanted to do when
Peter Hornyack implemented the "tell user space about unhandled MSRs"
changes in 2015. We just didn't realize it at the time.

Though your proposal does partially solve our problem, it's a lot
easier to specify a small set of MSRs by inclusion than it is by
exclusion. (Where your proposal falls short of our needs is when
userspace wants to handle an MSR that kvm would not typically
intercept at all.)

> > +
> > +This ioctl needs to be called before vCPUs are setup otherwise the list of MSRs
> > +will not be accepted and an EINVAL error will be returned.  Also, if a list of
> > +MSRs has already been supplied, and this ioctl is called again an EEXIST error
> > +will be returned.
> > +
> > +::
> > +
> > +  struct kvm_msr_list {
> > +  __u32 nmsrs;
> > +  __u32 indices[0];
> > +};
> > +
> >   X86:
> >   ^^^^
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index be5363b21540..510055471dd0 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1004,6 +1004,8 @@ struct kvm_arch {
> >
> >          struct kvm_pmu_event_filter *pmu_event_filter;
> >          struct task_struct *nx_lpage_recovery_thread;
> > +
> > +       struct kvm_msr_list *user_exit_msrs;
> >   };
> >
> >   struct kvm_vm_stat {
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 88c593f83b28..46a0fb9e0869 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -3419,6 +3419,42 @@ static inline bool kvm_can_mwait_in_guest(void)
> >                  boot_cpu_has(X86_FEATURE_ARAT);
> >   }
> >
> > +static int kvm_vm_ioctl_set_exit_msrs(struct kvm *kvm,
> > +                                     struct kvm_msr_list __user *user_msr_list)
> > +{
> > +       struct kvm_msr_list *msr_list, hdr;
> > +       size_t indices_size;
> > +
> > +       if (kvm->arch.user_exit_msrs != NULL)
> > +               return -EEXIST;
> > +
> > +       if (kvm->created_vcpus)
> > +               return -EINVAL;
>
> What if we need to change the set of user space handled MSRs
> dynamically, for example because a feature has been enabled through a
> previous MSR?

It's never been an issue for us, and this approach avoids the
messiness of having to back out the old changes to the MSR permissions
bitmaps, which is fraught. It can be done, but I would question the
ROI on the additional complexity. In any case, I think a VCPU ioctl
would be more appropriate than a VM ioctl if dynamic modifications
were allowed. For instance, in your example, the previous MSR write
that enabled a feature requiring a new user space MSR exit would
likely apply only to the VCPU on which that previous MSR write
happened.

> > +
> > +       if (copy_from_user(&hdr, user_msr_list, sizeof(hdr)))
> > +               return -EFAULT;
> > +
> > +       if (hdr.nmsrs >= MAX_IO_MSRS)
>
> This constant is not defined inside the scope of this patch. Is your
> patchset fully bisectable? Please make sure that every patch builds
> successfully on its own :).

I can't vouch for this patchset being fully bisectable, but this
particular constant was moved to x86.c ~13 years ago, in commit
313a3dc75da20 ("KVM: Portability: split kvm_vcpu_ioctl"). It should
not cause any build issues. :)



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux