On Mon, Jun 05, 2023, Jon Kohler wrote:
> > On May 31, 2023, at 5:09 PM, Jon Kohler <jon@xxxxxxxxxxx> wrote:
> >> The CPUID bits that enumerate support for a feature are independent from the CPUID
> >> bits that enumerate what XCR0 bits are supported, i.e. what features can be saved
> >> and restored via XSAVE/XRSTOR.
> >>
> >> KVM does mostly account for host XCR0, but in a very ad hoc way. E.g. MPX is
> >> handled by manually checking host XCR0.
> >>
> >> 	if (kvm_mpx_supported())
> >> 		kvm_cpu_cap_check_and_set(X86_FEATURE_MPX);
> >>
> >> PKU manually checks too, but indirectly by looking at whether or not the kernel
> >> has enabled CR4.OSPKE.
> >>
> >> 	if (!tdp_enabled || !boot_cpu_has(X86_FEATURE_OSPKE))
> >> 		kvm_cpu_cap_clear(X86_FEATURE_PKU);
> >>
> >> But unless I'm missing something, the various AVX and AMX bits rely solely on
> >> boot_cpu_data, i.e. would break if someone added CONFIG_X86_AVX or CONFIG_X86_AMX.
> >
> > What if we simply moved static unsigned short xsave_cpuid_features[] … into
> > xstate.h, which is already included in arch/x86/kvm/cpuid.c, and do
> > something similar to what I’m proposing in this patch already
> >
> > This would future proof such breakages I’d imagine?
> >
> > void kvm_set_cpu_caps(void)
> > {
> > ...
> > 	/*
> > 	 * Clear CPUID for XSAVE features that are disabled.
> > 	 */
> > 	for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
> > 		unsigned short cid = xsave_cpuid_features[i];
> >
> > 		/* Careful: X86_FEATURE_FPU is 0! */
> > 		if ((i != XFEATURE_FP && !cid) || !boot_cpu_has(cid) ||
> > 		    !cpu_feature_enabled(cid))
> > 			kvm_cpu_cap_clear(cid);
> > 	}
> > …
> > }
> >
>
> Sean - following up on this rough idea code above, wanted to validate that
> this was the direction you were thinking of having kvm_set_cpu_caps() clear
> caps when a particular xsave feature was disabled?

Ya, more or less. But for KVM, that should be kvm_cpu_cap_has(), not
boot_cpu_has().
And then I think KVM could actually WARN on a feature being disabled, i.e. put up
a tripwire to detect if things change in the future and the kernel lets the user
disable a feature that KVM wants to expose to a guest.

Side topic, I find the "cid" nomenclature super confusing, and the established
name in KVM is x86_feature.

Something like this?

	for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
		unsigned int x86_feature = xsave_cpuid_features[i];

		if (i != XFEATURE_FP && !x86_feature)
			continue;

		if (!kvm_cpu_cap_has(x86_feature))
			continue;

		if (WARN_ON_ONCE(!cpu_feature_enabled(x86_feature)))
			kvm_cpu_cap_clear(x86_feature);
	}
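
To make the control flow of that loop easier to poke at outside the kernel,
here is a rough user-space model. Everything prefixed with model_/MODEL_ is a
hypothetical stand-in invented for illustration: the two bool arrays stand in
for kvm_cpu_cap_has() and cpu_feature_enabled(), the table is a toy version of
xsave_cpuid_features[], and a counter plus fprintf() stands in for
WARN_ON_ONCE() (which, unlike this model, would only fire once). It is a
sketch of the logic only, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical xfeature indices; MODEL_XFEATURE_HOLE has no CPUID feature. */
enum model_xfeature {
	MODEL_XFEATURE_FP,
	MODEL_XFEATURE_SSE,
	MODEL_XFEATURE_YMM,
	MODEL_XFEATURE_HOLE,
	MODEL_XFEATURE_PKRU,
	MODEL_XFEATURE_COUNT,
};

/* Hypothetical feature numbers; the FPU feature is deliberately 0. */
enum model_feature {
	MODEL_FEATURE_FPU = 0,
	MODEL_FEATURE_XMM,
	MODEL_FEATURE_AVX,
	MODEL_FEATURE_PKU,
	MODEL_FEATURE_COUNT,
};

/* Toy table: entries left unset are 0, i.e. "no CPUID feature". */
static const unsigned int model_xsave_cpuid_features[MODEL_XFEATURE_COUNT] = {
	[MODEL_XFEATURE_FP]   = MODEL_FEATURE_FPU,
	[MODEL_XFEATURE_SSE]  = MODEL_FEATURE_XMM,
	[MODEL_XFEATURE_YMM]  = MODEL_FEATURE_AVX,
	[MODEL_XFEATURE_PKRU] = MODEL_FEATURE_PKU,
};

struct model_caps {
	bool kvm_cap[MODEL_FEATURE_COUNT];	/* models kvm_cpu_cap_has() */
	bool enabled[MODEL_FEATURE_COUNT];	/* models cpu_feature_enabled() */
};

/*
 * Clear any cap that is advertised but disabled, and return how many times
 * the tripwire fired.
 */
static int model_clear_disabled_xsave_caps(struct model_caps *caps)
{
	int warnings = 0;
	size_t i;

	for (i = 0; i < MODEL_XFEATURE_COUNT; i++) {
		unsigned int x86_feature = model_xsave_cpuid_features[i];

		/* 0 means "no feature", except for FP whose feature is 0. */
		if (i != MODEL_XFEATURE_FP && !x86_feature)
			continue;

		/* Nothing to clear if the cap isn't advertised. */
		if (!caps->kvm_cap[x86_feature])
			continue;

		/* Tripwire: advertised, but the feature is disabled. */
		if (!caps->enabled[x86_feature]) {
			fprintf(stderr, "tripwire: feature %u disabled\n",
				x86_feature);
			warnings++;
			caps->kvm_cap[x86_feature] = false;
		}
	}

	return warnings;
}
```

With the AVX stand-in advertised but disabled, only that cap is cleared and
the tripwire fires once. A feature that is disabled but was never advertised
(the PKU stand-in, say) is skipped silently, which is the point of checking
the KVM cap before the enabled state.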