On Thu, Sep 14, 2017 at 10:45:07AM +0100, Alex Bennée wrote: > > Dave Martin <Dave.Martin@xxxxxxx> writes: > > > This patch uses the cpufeatures framework to determine common SVE > > capabilities and vector lengths, and configures the runtime SVE > > support code appropriately. > > > > ZCR_ELx is not really a feature register, but it is convenient to > > use it as a template for recording the maximum vector length > > supported by a CPU, using the LEN field. This field is similar to > > a feature field in that it is a contiguous bitfield for which we > > want to determine the minimum system-wide value. This patch adds > > ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate > > custom code to populate it. Finding the minimum supported value of > > the LEN field is left to the cpufeatures framework in the usual > > way. > > > > The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet, > > so for now we just require it to be zero. > > > > Note that much of this code is dormant and SVE still won't be used > > yet, since system_supports_sve() remains hardwired to false. > > > > Signed-off-by: Dave Martin <Dave.Martin@xxxxxxx> > > Cc: Alex Bennée <alex.bennee@xxxxxxxxxx> > > Cc: Suzuki K Poulose <Suzuki.Poulose@xxxxxxx> > > > > --- > > > > Changes since v1 > > ---------------- > > > > Requested by Alex Bennée: > > > > * Thin out BUG_ON()s: > > Redundant BUG_ON()s and ones that just check invariants are removed. > > Important sanity-checks are migrated to WARN_ON()s, with some > > minimal best-effort patch-up code. > > > > Other changes related to Alex Bennée's comments: > > > > * Migrate away from magic numbers for converting VL to VQ. > > > > Requested by Suzuki Poulose: > > > > * Make sve_vq_map __ro_after_init. > > > > Other changes related to Suzuki Poulose's comments: > > > > * Rely on cpufeatures for not attempting to update the vq map after boot. > > --- > > arch/arm64/include/asm/cpu.h | 4 ++ > > arch/arm64/include/asm/cpufeature.h | 29 ++++++++++ > > arch/arm64/include/asm/fpsimd.h | 10 ++++ > > arch/arm64/kernel/cpufeature.c | 50 +++++++++++++++++ > > arch/arm64/kernel/cpuinfo.c | 6 ++ > > arch/arm64/kernel/fpsimd.c | 106 +++++++++++++++++++++++++++++++++++- > > 6 files changed, 202 insertions(+), 3 deletions(-) > > > > diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h > > index 889226b..8839227 100644 > > --- a/arch/arm64/include/asm/cpu.h > > +++ b/arch/arm64/include/asm/cpu.h > > @@ -41,6 +41,7 @@ struct cpuinfo_arm64 { > > u64 reg_id_aa64mmfr2; > > u64 reg_id_aa64pfr0; > > u64 reg_id_aa64pfr1; > > + u64 reg_id_aa64zfr0; > > > > u32 reg_id_dfr0; > > u32 reg_id_isar0; > > @@ -59,6 +60,9 @@ struct cpuinfo_arm64 { > > u32 reg_mvfr0; > > u32 reg_mvfr1; > > u32 reg_mvfr2; > > + > > + /* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */ > > + u64 reg_zcr; > > }; > > > > DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data); > > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h > > index 4ea3441..d98e7ba 100644 > > --- a/arch/arm64/include/asm/cpufeature.h > > +++ b/arch/arm64/include/asm/cpufeature.h > > @@ -10,7 +10,9 @@ > > #define __ASM_CPUFEATURE_H > > > > #include <asm/cpucaps.h> > > +#include <asm/fpsimd.h> > > #include <asm/hwcap.h> > > +#include <asm/sigcontext.h> > > #include <asm/sysreg.h> > > > > /* > > @@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0) > > return val == ID_AA64PFR0_EL0_32BIT_64BIT; > > } > > > > +static inline bool id_aa64pfr0_sve(u64 pfr0) > > +{ > > + u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT); > > + > > + return val > 0; > > +} > > + > > void __init setup_cpu_features(void); > > > > void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps, > > @@ -267,6 +276,26 @@ static inline bool system_supports_sve(void) > > return false; > > } > > > > +/* > > + * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE > > + * vector length. > > + * Use only if SVE is present. This function clobbers the SVE vector length. > > + */ > > :nit whitespace formatting. I'll add some newlines now to make this cleaner. /* * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE * vector length. * * Use only if SVE is present. * This function clobbers the SVE vector length. */ OK? > > > +static u64 __maybe_unused read_zcr_features(void) > > +{ > > + u64 zcr; > > + unsigned int vq_max; > > + > > + write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1); > > I'm confused, why are we writing something here? You mention clobbering > the SVE vector length but what was the point? Hmm, this deserves a comment -- coming back to this code, I had to think about it. Are the following extra comments sufficient explanation? /* * Set the maximum possible VL, and write zeroes to all other * bits to see if they stick. */ write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1); zcr = read_sysreg_s(SYS_ZCR_EL1); zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* flag up sticky 1s outside LEN field */ vq_max = sve_vq_from_vl(sve_get_vl()); zcr |= vq_max - 1; /* set LEN field to maximum effective value */ [...] > Otherwise: > > Reviewed-by: Alex Bennée <alex.bennee@xxxxxxxxxx> I'll wait on your responses to the above first. Cheers ---Dave