On Sat, Dec 14, 2024 at 10:56:13AM +0000, Marc Zyngier wrote: > [+ Mark] > On Sat, 14 Dec 2024 00:52:08 +0000, > Mark Brown <broonie@xxxxxxxxxx> wrote: > > > > In commit 892f7237b3ff ("arm64: Delay initialisation of > > cpuinfo_arm64::reg_{zcr,smcr}") we moved access to ZCR, SMCR and SMIDR > > later in the boot process in order to ensure that we don't attempt to > > interact with them if SVE or SME is disabled on the command line. > > Unfortunately when initialising the boot CPU in init_cpu_features() we work > > on a copy of the struct cpuinfo_arm64 for the boot CPU used only during > > boot, not the percpu copy used by the sysfs code. > > > > Fix this by moving the handling for SMIDR_EL1 for the boot CPU to > > cpuinfo_store_boot_cpu() so it can operate on the percpu copy of the data. > > This reduces the potential for error that could come from having both the > > percpu and boot CPU copies in init_cpu_features(). > > > > This issue wasn't apparent when testing on emulated platforms that do not > > report values in this ID register. > > > > Fixes: 892f7237b3ff ("arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}") > > Signed-off-by: Mark Brown <broonie@xxxxxxxxxx> > > Cc: stable@xxxxxxxxxxxxxxx > > --- > > arch/arm64/kernel/cpufeature.c | 6 ------ > > arch/arm64/kernel/cpuinfo.c | 11 +++++++++++ > > 2 files changed, 11 insertions(+), 6 deletions(-) > > > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c > > index 6ce71f444ed84f9056196bb21bbfac61c9687e30..b88102fd2c20f77e25af6df513fda09a484e882e 100644 > > --- a/arch/arm64/kernel/cpufeature.c > > +++ b/arch/arm64/kernel/cpufeature.c > > @@ -1167,12 +1167,6 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info) > > id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1))) { > > unsigned long cpacr = cpacr_save_enable_kernel_sme(); > > > > - /* > > - * We mask out SMPS since even if the hardware > > - * supports priorities the kernel does not at present > > - * and we block access to them. > > - */ > > - info->reg_smidr = read_cpuid(SMIDR_EL1) & ~SMIDR_EL1_SMPS; > > vec_init_vq_map(ARM64_VEC_SME); > > > > cpacr_restore(cpacr); > > diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c > > index d79e88fccdfce427507e7a34c5959ce6309cbd12..b7d403da71e5a01ed3943eb37e7a00af238771a2 100644 > > --- a/arch/arm64/kernel/cpuinfo.c > > +++ b/arch/arm64/kernel/cpuinfo.c > > @@ -499,4 +499,15 @@ void __init cpuinfo_store_boot_cpu(void) > > > > boot_cpu_data = *info; > > init_cpu_features(&boot_cpu_data); > > + > > + /* SMIDR_EL1 needs to be stored in the percpu data for sysfs */ > > + if (IS_ENABLED(CONFIG_ARM64_SME) && > > + id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1))) { > > + /* > > + * We mask out SMPS since even if the hardware > > + * supports priorities the kernel does not at present > > + * and we block access to them. > > + */ > > + info->reg_smidr = read_cpuid(SMIDR_EL1) & ~SMIDR_EL1_SMPS; > > + } > > } > > I don't understand the need to single out SMIDR_EL1. It seems to only > make things even more fragile than they already are by adding more > synchronisation phases. > > Why isn't the following a good enough fix? It makes it plain that > boot_cpu_data is only a copy of CPU0's initial boot state. > > diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c > index d79e88fccdfce..0cbb42fd48850 100644 > --- a/arch/arm64/kernel/cpuinfo.c > +++ b/arch/arm64/kernel/cpuinfo.c > @@ -497,6 +497,6 @@ void __init cpuinfo_store_boot_cpu(void) > struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0); > __cpuinfo_store_cpu(info); > > + init_cpu_features(info); > boot_cpu_data = *info; > - init_cpu_features(&boot_cpu_data); > } I think that change in isolation is fine, but I don't think that's the right fix. I think that what we did in commit: 892f7237b3ff ("arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}") ... introduces an anti-pattern that'd be nice to avoid. That broke the existing split of __cpuinfo_store_cpu() and init_cpu_features(), where the former read the ID regs, and the latter set up the features *without* altering the copy of the ID regs that was read. i.e. init_cpu_features() shouldn't write to its info argument at all. I understand that we have to do something as a bodge for broken FW which traps SME, but I'd much rather we did that within __cpuinfo_store_cpu(). Can we add something to check whether SME was disabled on the command line, and use that in __cpuinfo_store_cpu(), effectively reverting 892f7237b3ff? Mark.