On Fri, 03 Jan 2025 17:20:05 +0000,
Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> 
> On Fri, Jan 03, 2025 at 02:26:35PM +0000, Marc Zyngier wrote:
> > The hwcaps code that exposes SVE features to userspace only
> > considers ID_AA64ZFR0_EL1, while this is only valid when
> > ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> > 
> > The expectation is that when ID_AA64PFR0_EL1.SVE is 0, the
> > ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> > 
> > Things become a bit more interesting if the HW implements SME.
> > In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> > features. And these fields overlap with their SVE interpretations.
> > But the architecture says that the SME and SVE feature sets must
> > match, so we're still hunky-dory.
> > 
> > This goes wrong if the HW implements SME, but not SVE. In this
> > case, we end up advertising some SVE features to userspace, even
> > if the HW has none. That's because we never consider whether SVE
> > is actually implemented. Oh well.
> > 
> > Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> > being non-zero.
> > 
> > Reported-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > Cc: Will Deacon <will@xxxxxxxxxx>
> > Cc: Mark Rutland <mark.rutland@xxxxxxx>
> > Cc: Mark Brown <broonie@xxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> 
> I'd add:
> 
> Fixes: 06a916feca2b ("arm64: Expose SVE2 features for userspace")
> 
> While at the time the code was correct, the architecture messed up our
> assumptions with the introduction of SME.

Good point.

> > @@ -3022,6 +3027,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >  		.matches = match, \
> >  	}
> >  
> > +#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap) \
> > +	{ \
> > +		__HWCAP_CAP(#cap, cap_type, cap) \
> > +		HWCAP_CPUID_MATCH(reg, field, min_value) \
> > +		.matches = match, \
> > +	}
> 
> Do we actually need this macro?
It is either this macro, or a large-ish switch/case statement doing the
same thing. See below.

> > +
> >  #ifdef CONFIG_ARM64_PTR_AUTH
> >  static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = {
> >  	{
> > @@ -3050,6 +3062,18 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
> >  };
> >  #endif
> >  
> > +#ifdef CONFIG_ARM64_SVE
> > +static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
> > +{
> > +	u64 aa64pfr0 = __read_scoped_sysreg(SYS_ID_AA64PFR0_EL1, scope);
> > +
> > +	if (FIELD_GET(ID_AA64PFR0_EL1_SVE, aa64pfr0) < ID_AA64PFR0_EL1_SVE_IMP)
> > +		return false;
> > +
> > +	return has_user_cpuid_feature(cap, scope);
> > +}
> > +#endif
> 
> We can name this has_sve_feature() and use it with the existing
> HWCAP_CAP_MATCH() macro. I think it would look identical.

I don't think that works. HWCAP_CAP_MATCH() doesn't take the
reg/field/limit information that we need to compute the capability.
Without such information neatly populated in arm64_cpu_capabilities,
you can't call has_user_cpuid_feature().

A previous incarnation of the same patch was using that macro. But you
then end up having to map the cap to the field/limit and perform the
check "by hand".
Roughly, this would look like this:

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d793ca08549cd..76566a8bcdd3c 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -3065,12 +3065,22 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
 #ifdef CONFIG_ARM64_SVE
 static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
 {
-	u64 aa64pfr0 = __read_scoped_sysreg(SYS_ID_AA64PFR0_EL1, scope);
+	u64 zfr0;
 
-	if (FIELD_GET(ID_AA64PFR0_EL1_SVE, aa64pfr0) < ID_AA64PFR0_EL1_SVE_IMP)
+	if (!system_supports_sve())
 		return false;
 
-	return has_user_cpuid_feature(cap, scope);
+	zfr0 = __read_scoped_sysreg(SYS_ID_AA64ZFR0_EL1, scope);
+
+	switch (cap->cap) {
+	case KERNEL_HWCAP_SVE2P1:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, SVEver, zfr0) >= ID_AA64ZFR0_EL1_SVEver_SVE2p1;
+	case KERNEL_HWCAP_SVE2:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, SVEver, zfr0) >= ID_AA64ZFR0_EL1_SVEver_SVE2;
+	case KERNEL_HWCAP_SVEAES:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, AES, zfr0) >= ID_AA64ZFR0_EL1_AES_IMP;
+	[...]
+	}
 }
 #endif

Frankly, I don't think this is worth it, and you still need to hack
read_scoped_sysreg().

> We might even be able to use system_supports_sve() directly and avoid
> changing read_scoped_sysreg(). setup_user_features() is called in
> smp_cpus_done() after setup_system_features(), so using
> system_supports_sve() directly should be fine here.

Yeah, that should work.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.