On Wed, May 31, 2023, Borislav Petkov wrote: > On Wed, May 31, 2023 at 08:18:34PM +0000, Sean Christopherson wrote: > > Assert that the to-be-checked bit passed to cpu_feature_enabled() is a > > compile-time constant instead of applying the DISABLED_MASK_BIT_SET() > > logic if and only if the bit is a constant. Conditioning the check on > > the bit being constant instead of requiring the bit to be constant could > > result in compiler specific kernel behavior, e.g. running on hardware that > > supports a disabled feature would return %false if the compiler resolved > > the bit to a constant, but %true if not. > > Uff, more mirroring CPUID inconsistencies. > > So *actually*, we should clear all those build-time disabled bits from > x86_capability so that this doesn't happen. Heh, I almost suggested that, but there is a non-zero amount of code that wants to ignore the disabled bits and query the "raw" CPUID information. In quotes because the kernel still massages x86_capability. Changing that behavior will require auditing a lot of code, because in most cases any breakage will be mostly silent, e.g. loss of features/performance and not explosions. E.g. KVM emulates UMIP when it's not supported in hardware, and so advertises UMIP support irrespective of hardware/host support. But emulating UMIP is imperfect and suboptimal (requires intercepting L*DT instructions), so KVM intercepts L*DT instructions iff UMIP is not supported in hardware, as detected by boot_cpu_has(X86_FEATURE_UMIP). The comment for cpu_feature_enabled() even calls out this type of use case: Use the cpu_has() family if you want true runtime testing of CPU features, like in hypervisor code where you are supporting a possible guest feature where host support for it is not relevant. That said, the behavior of cpu_has() is wildly inconsistent, e.g. LA57 is indirectly cleared in x86_capability if it's a disabled bit because of this code in early_identify_cpu(). if (!pgtable_l5_enabled()) setup_clear_cpu_cap(X86_FEATURE_LA57); KVM works around that by manually doing CPUID to query hardware directly: /* Set LA57 based on hardware capability. */ if (cpuid_ecx(7) & F(LA57)) kvm_cpu_cap_set(X86_FEATURE_LA57); So yeah, I 100% agree the current state is messy and would love to have cpu_feature_enabled() be a pure optimization with respect to boot_cpu_has(), but it's not as trivial at it looks.