On Mon, Feb 24, 2020 at 05:32:54PM +0100, Vitaly Kuznetsov wrote: > Sean Christopherson <sean.j.christopherson@xxxxxxxxx> writes: > > > Calculate the CPUID masks for KVM_GET_SUPPORTED_CPUID at load time using > > what is effectively a KVM-adjusted copy of boot_cpu_data, or more > > precisely, the x86_capability array in boot_cpu_data. > > > > In terms of KVM support, the vast majority of CPUID feature bits are > > constant, and *all* feature support is known at KVM load time. Rather > > than apply boot_cpu_data, which is effectively read-only after init, > > at runtime, copy it into a KVM-specific array and use *that* to mask > > CPUID registers. > > > > In additional to consolidating the masking, kvm_cpu_caps can be adjusted > > by SVM/VMX at load time and thus eliminate all feature bit manipulation > > in ->set_supported_cpuid(). > > > > No functional change intended. > > > > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > > --- > > arch/x86/kvm/cpuid.c | 229 +++++++++++++++++++++++-------------------- > > arch/x86/kvm/cpuid.h | 19 ++++ > > arch/x86/kvm/x86.c | 2 + > > 3 files changed, 142 insertions(+), 108 deletions(-) > > > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c > > index 20a7af320291..c2a4c9df49a9 100644 > > --- a/arch/x86/kvm/cpuid.c > > +++ b/arch/x86/kvm/cpuid.c > > @@ -24,6 +24,13 @@ > > #include "trace.h" > > #include "pmu.h" > > > > +/* > > + * Unlike "struct cpuinfo_x86.x86_capability", kvm_cpu_caps doesn't need to be > > + * aligned to sizeof(unsigned long) because it's not accessed via bitops. > > + */ > > +u32 kvm_cpu_caps[NCAPINTS] __read_mostly; > > +EXPORT_SYMBOL_GPL(kvm_cpu_caps); > > + > > static u32 xstate_required_size(u64 xstate_bv, bool compacted) > > { > > int feature_bit = 0; > > @@ -259,7 +266,119 @@ static __always_inline void cpuid_entry_mask(struct kvm_cpuid_entry2 *entry, > > { > > u32 *reg = cpuid_entry_get_reg(entry, leaf * 32); > > > > - *reg &= boot_cpu_data.x86_capability[leaf]; > > + BUILD_BUG_ON(leaf > ARRAY_SIZE(kvm_cpu_caps)); > > Should this be '>=' ? Yep, nice catch. > > + *reg &= kvm_cpu_caps[leaf]; > > +} > > + > > +static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask) > > +{ > > + reverse_cpuid_check(leaf); > > + kvm_cpu_caps[leaf] &= mask; > > +} > > + > > +void kvm_set_cpu_caps(void) > > +{ > > + unsigned f_nx = is_efer_nx() ? F(NX) : 0; > > +#ifdef CONFIG_X86_64 > > + unsigned f_gbpages = F(GBPAGES); > > + unsigned f_lm = F(LM); > > +#else > > + unsigned f_gbpages = 0; > > + unsigned f_lm = 0; > > +#endif > > Three too many bare 'unsinged's :-) Roger that, I'll fix this up. > > + > > + BUILD_BUG_ON(sizeof(kvm_cpu_caps) > > > + sizeof(boot_cpu_data.x86_capability)); > > + > > + memcpy(&kvm_cpu_caps, &boot_cpu_data.x86_capability, > > + sizeof(kvm_cpu_caps)); > > + > > + kvm_cpu_cap_mask(CPUID_1_EDX, > > + F(FPU) | F(VME) | F(DE) | F(PSE) | > > + F(TSC) | F(MSR) | F(PAE) | F(MCE) | > > + F(CX8) | F(APIC) | 0 /* Reserved */ | F(SEP) | > > + F(MTRR) | F(PGE) | F(MCA) | F(CMOV) | > > + F(PAT) | F(PSE36) | 0 /* PSN */ | F(CLFLUSH) | > > + 0 /* Reserved, DS, ACPI */ | F(MMX) | > > + F(FXSR) | F(XMM) | F(XMM2) | F(SELFSNOOP) | > > + 0 /* HTT, TM, Reserved, PBE */ > > + ); > > + > > + kvm_cpu_cap_mask(CPUID_8000_0001_EDX, > > + F(FPU) | F(VME) | F(DE) | F(PSE) | > > + F(TSC) | F(MSR) | F(PAE) | F(MCE) | > > + F(CX8) | F(APIC) | 0 /* Reserved */ | F(SYSCALL) | > > + F(MTRR) | F(PGE) | F(MCA) | F(CMOV) | > > + F(PAT) | F(PSE36) | 0 /* Reserved */ | > > + f_nx | 0 /* Reserved */ | F(MMXEXT) | F(MMX) | > > + F(FXSR) | F(FXSR_OPT) | f_gbpages | F(RDTSCP) | > > + 0 /* Reserved */ | f_lm | F(3DNOWEXT) | F(3DNOW) > > + ); > > + > > + kvm_cpu_cap_mask(CPUID_1_ECX, > > + /* NOTE: MONITOR (and MWAIT) are emulated as NOP, > > + * but *not* advertised to guests via CPUID ! */ > > + F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64, MONITOR */ | > > + 0 /* DS-CPL, VMX, SMX, EST */ | > > + 0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ | > > + F(FMA) | F(CX16) | 0 /* xTPR Update, PDCM */ | > > + F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) | > > + F(XMM4_2) | F(X2APIC) | F(MOVBE) | F(POPCNT) | > > + 0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) | > > + F(F16C) | F(RDRAND) > > + ); > > I would suggest we order things by CPUID_NUM here, i.e. > > CPUID_1_ECX > CPUID_1_EDX > CPUID_7_1_EAX > CPUID_7_0_EBX > CPUID_7_ECX > CPUID_7_EDX > CPUID_D_1_EAX > ... Hmm, generally speaking I agree, but I didn't want to change the ordering in this patch when moving the code. Throw a patch on top? Leave as is? Something else?