On Fri, Sep 27, 2019 at 10:22:51AM -0700, Jim Mattson wrote:
> On Fri, Sep 27, 2019 at 9:32 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >
> > On 27/09/19 18:10, Jim Mattson wrote:
> > > On Fri, Sep 27, 2019 at 9:06 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> > >>
> > >> On 27/09/19 17:58, Xiaoyao Li wrote:
> > >>> Indeed, "KVM_GET_MSR_INDEX_LIST" returns the guest msrs that KVM supports and
> > >>> they are free from different guest configuration since they're initialized when
> > >>> kvm module is loaded.
> > >>>
> > >>> Even though some MSRs are not exposed to guest by clearing their related cpuid
> > >>> bits, they are still saved/restored by QEMU in the same fashion.
> > >>>
> > >>> I wonder should we change "KVM_GET_MSR_INDEX_LIST" per VM?
> > >>
> > >> We can add a per-VM version too, yes.
> >
> > There is one problem with that: KVM_SET_CPUID2 is a vCPU ioctl, not a VM
> > ioctl.
> >
> > > Should the system-wide version continue to list *some* supported MSRs
> > > and *some* unsupported MSRs, with no rhyme or reason? Or should we
> > > codify what that list contains?
> >
> > The optimal thing would be for it to list only MSRs that are
> > unconditionally supported by all VMs and are part of the runtime state.
> > MSRs that are not part of the runtime state, such as the VMX
> > capabilities, should be returned by KVM_GET_MSR_FEATURE_INDEX_LIST.
> >
> > This also means that my own commit 95c5c7c77c06 ("KVM: nVMX: list VMX
> > MSRs in KVM_GET_MSR_INDEX_LIST", 2019-07-02) was incorrect.
> > Unfortunately, that commit was done because userspace (QEMU) has a
> > genuine need to detect whether KVM is new enough to support the
> > IA32_VMX_VMFUNC MSR.
> >
> > Perhaps we can make all MSRs supported unconditionally if
> > host_initiated. For unsupported performance counters it's easy to make
> > them return 0, and allow setting them to 0, if host_initiated (BTW, how
> > did you pick 32? is there any risk of conflicts with other MSRs?).
>
> 32 comes from INTEL_PMC_MAX_GENERIC. There are definitely conflicts.
> (Sorry; this should have occurred to me earlier.) 32 event selectors
> would occupy indices [0x186, 0x1a6). But on the architectural MSR
> list, only indices up through 0x197 are "reserved" (presumably for
> future event selectors). 32 GP counters would occupy indices [0xc1,
> 0xe1). But on the architectural MSR list, only indices up through 0xc8
> are defined for GP counters. None are marked "reserved" for future
> expansion, but none in the range (0xc8, 0xe1) are defined either.
>
> Perhaps INTEL_PMC_MAX_GENERIC should be reduced to 18. If we removed
> event selectors and counters above 18, would my original approach
> work?

Heh, VMX is technically available on P4 processors, which don't support the
architectural PMU. Generating the list based on hardware CPUID seems both
safer and easier.
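
To make the CPUID-based idea concrete, here is a rough standalone sketch, not
actual KVM code. It takes the raw CPUID.0AH:EAX value (bits 7:0 are the PMU
version, bits 15:8 the number of GP counters) and emits only the PMU MSR
indices the hardware claims to have. pmu_msr_list() and the two MAX_ARCH_*
caps are names made up for this example; the caps simply mirror the ranges
quoted above (counters up through 0xc8, event selectors up through 0x197),
and whether clamping there is the right policy is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_ARCH_PERFMON_PERFCTR0   0xc1u   /* first GP counter */
#define MSR_ARCH_PERFMON_EVENTSEL0  0x186u  /* first event selector */

/* Hypothetical caps, taken from the ranges discussed in this thread. */
#define MAX_ARCH_GP_COUNTERS   8u   /* 0xc1  .. 0xc8  */
#define MAX_ARCH_EVENTSELS    18u   /* 0x186 .. 0x197 */

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: fill 'out' with the PMU MSR indices to report for
 * the given CPUID.0AH:EAX value. Returns the number of entries written,
 * never more than 'cap'.
 */
static size_t pmu_msr_list(uint32_t cpuid_0a_eax, uint32_t *out, size_t cap)
{
	uint32_t version = cpuid_0a_eax & 0xff;
	uint32_t nr_gp = (cpuid_0a_eax >> 8) & 0xff;
	uint32_t nr_ctr, nr_sel, i;
	size_t n = 0;

	assert(out != NULL);

	/* No architectural PMU (e.g. P4): report no PMU MSRs at all. */
	if (version == 0)
		return 0;

	/* Clamp to the architecturally defined/reserved index ranges. */
	nr_ctr = min_u32(nr_gp, MAX_ARCH_GP_COUNTERS);
	nr_sel = min_u32(nr_gp, MAX_ARCH_EVENTSELS);

	for (i = 0; i < nr_ctr && n < cap; i++)
		out[n++] = MSR_ARCH_PERFMON_PERFCTR0 + i;
	for (i = 0; i < nr_sel && n < cap; i++)
		out[n++] = MSR_ARCH_PERFMON_EVENTSEL0 + i;

	return n;
}
```

With this shape a CPU without the architectural PMU yields an empty list, and
a CPU that reports more counters than the architectural MSR list defines
can't spill into unrelated indices.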