Re: [PATCH] kvm: x86: Add Intel PMU MSRs to msrs_to_save[]

Jim Mattson <jmattson@xxxxxxxxxx> · Fri, 27 Sep 2019 10:30:38 -0700

On Fri, Sep 27, 2019 at 10:14 AM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Fri, Sep 27, 2019 at 06:32:27PM +0200, Paolo Bonzini wrote:
> > On 27/09/19 18:10, Jim Mattson wrote:
> > > On Fri, Sep 27, 2019 at 9:06 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> > >>
> > >> On 27/09/19 17:58, Xiaoyao Li wrote:
> > >>> Indeed, "KVM_GET_MSR_INDEX_LIST" returns the guest msrs that KVM supports and
> > >>> they are free from different guest configuration since they're initialized when
> > >>> kvm module is loaded.
> > >>>
> > >>> Even though some MSRs are not exposed to guest by clear their related cpuid
> > >>> bits, they are still saved/restored by QEMU in the same fashion.
> > >>>
> > >>> I wonder should we change "KVM_GET_MSR_INDEX_LIST" per VM?
> > >>
> > >> We can add a per-VM version too, yes.
> >
> > There is one problem with that: KVM_SET_CPUID2 is a vCPU ioctl, not a VM
> > ioctl.
> >
> > > Should the system-wide version continue to list *some* supported MSRs
> > > and *some* unsupported MSRs, with no rhyme or reason? Or should we
> > > codify what that list contains?
> >
> > The optimal thing would be for it to list only MSRs that are
> > unconditionally supported by all VMs and are part of the runtime state.
> >  MSRs that are not part of the runtime state, such as the VMX
> > capabilities, should be returned by KVM_GET_MSR_FEATURE_INDEX_LIST.
> >
> > This also means that my own commit 95c5c7c77c06 ("KVM: nVMX: list VMX
> > MSRs in KVM_GET_MSR_INDEX_LIST", 2019-07-02) was incorrect.
> > Unfortunately, that commit was done because userspace (QEMU) has a
> > genuine need to detect whether KVM is new enough to support the
> > IA32_VMX_VMFUNC MSR.
> >
> > Perhaps we can make all MSRs supported unconditionally if
> > host_initiated.  For unsupported performance counters it's easy to make
> > them return 0, and allow setting them to 0, if host_initiated
>
> I don't think we need to go that far.  Allowing any ol' MSR access seems
> like it would cause more problems than it would solve, e.g. userspace
> could completely botch something and never know.
>
> For the perf MSRs, could we enumerate all arch perf MSRs that are supported
> by hardware?  That would also be the list of MSRs that host_initiated MSR
> accesses can touch regardless of guest support.
>
> Something like:
>
>         case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0+INTEL_PMC_MAX_GENERIC:
>         case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0+INTEL_PMC_MAX_GENERIC:
>                 if (kvm_pmu_is_valid_msr(vcpu, msr))
>                         return kvm_pmu_set_msr(vcpu, msr_info);
>                 else if (msr <= num_hw_counters)
>                         break;
>                 return 1;

That doesn't quite work, since you need a vcpu, and
KVM_GET_MSR_INDEX_LIST is a system-wide ioctl, not a VCPU ioctl.

> > (BTW, how did you pick 32?  is there any risk of conflicts with other MSRs?).
>
> Presumably because INTEL_PMC_MAX_GENERIC is 32.
>
> > I'm not sure of the best set of values to allow for VMX caps, especially
> > with the default0/default1 stuff going on for execution controls.  But
> > perhaps that would be the simplest thing to do.
> >
> > One possibility would be to make a KVM_GET_MSR_INDEX_LIST variant that
> > is a system ioctl and takes a CPUID vector.  I'm worried that it would
> > be tedious to get right and hardish to keep correct---so I'm not sure
> > it's a good idea.
> >
> > Paolo