On Fri, Sep 27, 2019 at 10:30:38AM -0700, Jim Mattson wrote:
> On Fri, Sep 27, 2019 at 10:14 AM Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > On Fri, Sep 27, 2019 at 06:32:27PM +0200, Paolo Bonzini wrote:
> > > On 27/09/19 18:10, Jim Mattson wrote:
> > > > On Fri, Sep 27, 2019 at 9:06 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> > > >>
> > > >> On 27/09/19 17:58, Xiaoyao Li wrote:
> > > >>> Indeed, "KVM_GET_MSR_INDEX_LIST" returns the guest msrs that KVM supports and
> > > >>> they are free from different guest configuration since they're initialized when
> > > >>> kvm module is loaded.
> > > >>>
> > > >>> Even though some MSRs are not exposed to guest by clear their related cpuid
> > > >>> bits, they are still saved/restored by QEMU in the same fashion.
> > > >>>
> > > >>> I wonder should we change "KVM_GET_MSR_INDEX_LIST" per VM?
> > > >>
> > > >> We can add a per-VM version too, yes.
> > >
> > > There is one problem with that: KVM_SET_CPUID2 is a vCPU ioctl, not a VM
> > > ioctl.
> > >
> > > > Should the system-wide version continue to list *some* supported MSRs
> > > > and *some* unsupported MSRs, with no rhyme or reason? Or should we
> > > > codify what that list contains?
> > >
> > > The optimal thing would be for it to list only MSRs that are
> > > unconditionally supported by all VMs and are part of the runtime state.
> > > MSRs that are not part of the runtime state, such as the VMX
> > > capabilities, should be returned by KVM_GET_MSR_FEATURE_INDEX_LIST.
> > >
> > > This also means that my own commit 95c5c7c77c06 ("KVM: nVMX: list VMX
> > > MSRs in KVM_GET_MSR_INDEX_LIST", 2019-07-02) was incorrect.
> > > Unfortunately, that commit was done because userspace (QEMU) has a
> > > genuine need to detect whether KVM is new enough to support the
> > > IA32_VMX_VMFUNC MSR.
> > >
> > > Perhaps we can make all MSRs supported unconditionally if
> > > host_initiated. For unsupported performance counters it's easy to make
> > > them return 0, and allow setting them to 0, if host_initiated
> >
> > I don't think we need to go that far. Allowing any ol' MSR access seems
> > like it would cause more problems than it would solve, e.g. userspace
> > could completely botch something and never know.
> >
> > For the perf MSRs, could we enumerate all arch perf MSRs that are supported
> > by hardware? That would also be the list of MSRs that host_initiated MSR
> > accesses can touch regardless of guest support.
> >
> > Something like:
> >
> >	case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0+INTEL_PMC_MAX_GENERIC:
> >	case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0+INTEL_PMC_MAX_GENERIC:
> >		if (kvm_pmu_is_valid_msr(vcpu, msr))
> >			return kvm_pmu_set_msr(vcpu, msr_info);
> >		else if (msr <= num_hw_counters)
> >			break;
> >		return 1;
>
> That doesn't quite work, since you need a vcpu, and
> KVM_GET_MSR_INDEX_LIST is a system-wide ioctl, not a VCPU ioctl.

That'd be for the {kvm,vmx}_set_msr() flow. The KVM_GET_MSR_INDEX_LIST flow
would report all MSRs from 0..num_hw_counters, where num_hw_counters is pulled
from CPUID.
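
To make the split between the two flows concrete, here is a rough standalone
sketch, not kernel code. It assumes "0..num_hw_counters" means counter offsets
relative to the PERFCTR0/EVENTSEL0 base MSRs, and that num_hw_counters comes
from CPUID.0AH:EAX[15:8] (the thread only says "pulled from CPUID"). The three
helper names are made up for illustration; the MSR numbers and
INTEL_PMC_MAX_GENERIC mirror the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the kernel's msr-index.h / perf_event.h values. */
#define MSR_ARCH_PERFMON_PERFCTR0	0xc1
#define MSR_ARCH_PERFMON_EVENTSEL0	0x186
#define INTEL_PMC_MAX_GENERIC		32

/*
 * Hypothetical helper: number of general-purpose counters, taken from
 * bits 15:8 of CPUID.0AH:EAX and clamped to INTEL_PMC_MAX_GENERIC.
 */
static unsigned int num_hw_counters_from_cpuid(uint32_t cpuid_0a_eax)
{
	unsigned int n = (cpuid_0a_eax >> 8) & 0xff;

	return n > INTEL_PMC_MAX_GENERIC ? INTEL_PMC_MAX_GENERIC : n;
}

/*
 * Hypothetical system-wide enumeration (the KVM_GET_MSR_INDEX_LIST side):
 * no vcpu needed, only the host CPUID value. Writes at most cap entries
 * to out[] and returns how many were written.
 */
static size_t list_arch_perf_msrs(uint32_t cpuid_0a_eax, uint32_t *out,
				  size_t cap)
{
	unsigned int n = num_hw_counters_from_cpuid(cpuid_0a_eax);
	size_t count = 0;
	unsigned int i;

	assert(out != NULL);

	for (i = 0; i < n && count < cap; i++)
		out[count++] = MSR_ARCH_PERFMON_PERFCTR0 + i;
	for (i = 0; i < n && count < cap; i++)
		out[count++] = MSR_ARCH_PERFMON_EVENTSEL0 + i;

	return count;
}

/*
 * Hypothetical check for the set_msr side: may a host_initiated access
 * touch this perf MSR even if the guest PMU does not expose it? The
 * comparison is on the counter offset, not on the raw MSR number.
 */
static int host_may_access_perf_msr(uint32_t msr, unsigned int num_hw_counters)
{
	if (msr >= MSR_ARCH_PERFMON_PERFCTR0 &&
	    msr < MSR_ARCH_PERFMON_PERFCTR0 + INTEL_PMC_MAX_GENERIC)
		return msr - MSR_ARCH_PERFMON_PERFCTR0 < num_hw_counters;

	if (msr >= MSR_ARCH_PERFMON_EVENTSEL0 &&
	    msr < MSR_ARCH_PERFMON_EVENTSEL0 + INTEL_PMC_MAX_GENERIC)
		return msr - MSR_ARCH_PERFMON_EVENTSEL0 < num_hw_counters;

	return 0;
}
```

With a host reporting 4 counters, the list would hold 0xc1..0xc4 followed by
0x186..0x189, and a host_initiated write to 0xc5 would still be rejected. The
point of keeping the two helpers separate is that only the set_msr side ever
needs a vcpu; the enumeration depends on host CPUID alone.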