From: Suzuki K Poulose <suzuki.poulose@xxxxxxx> [ Upstream commit 143ba05d867af34827faf99e0eed4de27106c7cb ] We use arm64_cpu_capabilities to represent CPU ELF HWCAPs exposed to the userspace and the CPU hwcaps used by the kernel, which include cpu features and CPU errata work arounds. Capabilities have some properties that decide how they should be treated : 1) Detection, i.e scope : A cap could be "detected" either : - if it is present on at least one CPU (SCOPE_LOCAL_CPU) Or - if it is present on all the CPUs (SCOPE_SYSTEM) 2) When is it enabled ? - A cap is treated as "enabled" when the system takes some action based on whether the capability is detected or not. e.g, setting some control register, patching the kernel code. Right now, we treat all caps are enabled at boot-time, after all the CPUs are brought up by the kernel. But there are certain caps, which are enabled early during the boot (e.g, VHE, GIC_CPUIF for NMI) and kernel starts using them, even before the secondary CPUs are brought up. We would need a way to describe this for each capability. 3) Conflict on a late CPU - When a CPU is brought up, it is checked against the caps that are known to be enabled on the system (via verify_local_cpu_capabilities()). Based on the state of the capability on the CPU vs. that of System we could have the following combinations of conflict. x-----------------------------x | Type | System | Late CPU | ------------------------------| | a | y | n | ------------------------------| | b | n | y | x-----------------------------x Case (a) is not permitted for caps which are system features, which the system expects all the CPUs to have (e.g VHE). While (a) is ignored for all errata work arounds. However, there could be exceptions to the plain filtering approach. e.g, KPTI is an optional feature for a late CPU as long as the system already enables it. Case (b) is not permitted for errata work arounds which requires some work around, which cannot be delayed. And we ignore (b) for features. Here, yet again, KPTI is an exception, where if a late CPU needs KPTI we are too late to enable it (because we change the allocation of ASIDs etc). So this calls for a lot more fine grained behavior for each capability. And if we define all the attributes to control their behavior properly, we may be able to use a single table for the CPU hwcaps (which cover errata and features, not the ELF HWCAPs). This is a prepartory step to get there. More bits would be added for the properties listed above. We are going to use a bit-mask to encode all the properties of a capabilities. This patch encodes the "SCOPE" of the capability. As such there is no change in how the capabilities are treated. Cc: Mark Rutland <mark.rutland@xxxxxxx> Reviewed-by: Dave Martin <dave.martin@xxxxxxx> Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx> Signed-off-by: Will Deacon <will.deacon@xxxxxxx> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- arch/arm64/include/asm/cpufeature.h | 105 +++++++++++++++++++++++++++++++++--- arch/arm64/kernel/cpu_errata.c | 12 ++-- arch/arm64/kernel/cpufeature.c | 34 +++++------ 3 files changed, 122 insertions(+), 29 deletions(-) --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -85,16 +85,104 @@ struct arm64_ftr_reg { extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0; -/* scope of capability check */ -enum { - SCOPE_SYSTEM, - SCOPE_LOCAL_CPU, -}; +/* + * CPU capabilities: + * + * We use arm64_cpu_capabilities to represent system features, errata work + * arounds (both used internally by kernel and tracked in cpu_hwcaps) and + * ELF HWCAPs (which are exposed to user). + * + * To support systems with heterogeneous CPUs, we need to make sure that we + * detect the capabilities correctly on the system and take appropriate + * measures to ensure there are no incompatibilities. + * + * This comment tries to explain how we treat the capabilities. + * Each capability has the following list of attributes : + * + * 1) Scope of Detection : The system detects a given capability by + * performing some checks at runtime. This could be, e.g, checking the + * value of a field in CPU ID feature register or checking the cpu + * model. The capability provides a call back ( @matches() ) to + * perform the check. Scope defines how the checks should be performed. + * There are two cases: + * + * a) SCOPE_LOCAL_CPU: check all the CPUs and "detect" if at least one + * matches. This implies, we have to run the check on all the + * booting CPUs, until the system decides that state of the + * capability is finalised. (See section 2 below) + * Or + * b) SCOPE_SYSTEM: check all the CPUs and "detect" if all the CPUs + * matches. This implies, we run the check only once, when the + * system decides to finalise the state of the capability. If the + * capability relies on a field in one of the CPU ID feature + * registers, we use the sanitised value of the register from the + * CPU feature infrastructure to make the decision. + * + * The process of detection is usually denoted by "update" capability + * state in the code. + * + * 2) Finalise the state : The kernel should finalise the state of a + * capability at some point during its execution and take necessary + * actions if any. Usually, this is done, after all the boot-time + * enabled CPUs are brought up by the kernel, so that it can make + * better decision based on the available set of CPUs. However, there + * are some special cases, where the action is taken during the early + * boot by the primary boot CPU. (e.g, running the kernel at EL2 with + * Virtualisation Host Extensions). The kernel usually disallows any + * changes to the state of a capability once it finalises the capability + * and takes any action, as it may be impossible to execute the actions + * safely. A CPU brought up after a capability is "finalised" is + * referred to as "Late CPU" w.r.t the capability. e.g, all secondary + * CPUs are treated "late CPUs" for capabilities determined by the boot + * CPU. + * + * 3) Verification: When a CPU is brought online (e.g, by user or by the + * kernel), the kernel should make sure that it is safe to use the CPU, + * by verifying that the CPU is compliant with the state of the + * capabilities finalised already. This happens via : + * + * secondary_start_kernel()-> check_local_cpu_capabilities() + * + * As explained in (2) above, capabilities could be finalised at + * different points in the execution. Each CPU is verified against the + * "finalised" capabilities and if there is a conflict, the kernel takes + * an action, based on the severity (e.g, a CPU could be prevented from + * booting or cause a kernel panic). The CPU is allowed to "affect" the + * state of the capability, if it has not been finalised already. + * + * 4) Action: As mentioned in (2), the kernel can take an action for each + * detected capability, on all CPUs on the system. Appropriate actions + * include, turning on an architectural feature, modifying the control + * registers (e.g, SCTLR, TCR etc.) or patching the kernel via + * alternatives. The kernel patching is batched and performed at later + * point. The actions are always initiated only after the capability + * is finalised. This is usally denoted by "enabling" the capability. + * The actions are initiated as follows : + * a) Action is triggered on all online CPUs, after the capability is + * finalised, invoked within the stop_machine() context from + * enable_cpu_capabilitie(). + * + * b) Any late CPU, brought up after (1), the action is triggered via: + * + * check_local_cpu_capabilities() -> verify_local_cpu_capabilities() + * + */ + + +/* Decide how the capability is detected. On a local CPU vs System wide */ +#define ARM64_CPUCAP_SCOPE_LOCAL_CPU ((u16)BIT(0)) +#define ARM64_CPUCAP_SCOPE_SYSTEM ((u16)BIT(1)) +#define ARM64_CPUCAP_SCOPE_MASK \ + (ARM64_CPUCAP_SCOPE_SYSTEM | \ + ARM64_CPUCAP_SCOPE_LOCAL_CPU) + +#define SCOPE_SYSTEM ARM64_CPUCAP_SCOPE_SYSTEM +#define SCOPE_LOCAL_CPU ARM64_CPUCAP_SCOPE_LOCAL_CPU struct arm64_cpu_capabilities { const char *desc; u16 capability; - int def_scope; /* default scope */ + u16 type; bool (*matches)(const struct arm64_cpu_capabilities *caps, int scope); /* * Take the appropriate actions to enable this capability for this CPU. @@ -119,6 +207,11 @@ struct arm64_cpu_capabilities { }; }; +static inline int cpucap_default_scope(const struct arm64_cpu_capabilities *cap) +{ + return cap->type & ARM64_CPUCAP_SCOPE_MASK; +} + extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS); extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS]; extern struct static_key_false arm64_const_caps_ready; --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -406,14 +406,14 @@ static bool has_ssbd_mitigation(const st #endif /* CONFIG_ARM64_SSBD */ #define MIDR_RANGE(model, min, max) \ - .def_scope = SCOPE_LOCAL_CPU, \ + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU, \ .matches = is_affected_midr_range, \ .midr_model = model, \ .midr_range_min = min, \ .midr_range_max = max #define MIDR_ALL_VERSIONS(model) \ - .def_scope = SCOPE_LOCAL_CPU, \ + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU, \ .matches = is_affected_midr_range, \ .midr_model = model, \ .midr_range_min = 0, \ @@ -517,14 +517,14 @@ const struct arm64_cpu_capabilities arm6 .desc = "Mismatched cache line size", .capability = ARM64_MISMATCHED_CACHE_LINE_SIZE, .matches = has_mismatched_cache_type, - .def_scope = SCOPE_LOCAL_CPU, + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU, .cpu_enable = cpu_enable_trap_ctr_access, }, { .desc = "Mismatched cache type", .capability = ARM64_MISMATCHED_CACHE_TYPE, .matches = has_mismatched_cache_type, - .def_scope = SCOPE_LOCAL_CPU, + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU, .cpu_enable = cpu_enable_trap_ctr_access, }, #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003 @@ -538,7 +538,7 @@ const struct arm64_cpu_capabilities arm6 { .desc = "Qualcomm Technologies Kryo erratum 1003", .capability = ARM64_WORKAROUND_QCOM_FALKOR_E1003, - .def_scope = SCOPE_LOCAL_CPU, + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU, .midr_model = MIDR_QCOM_KRYO, .matches = is_kryo_midr, }, @@ -613,7 +613,7 @@ const struct arm64_cpu_capabilities arm6 #ifdef CONFIG_ARM64_SSBD { .desc = "Speculative Store Bypass Disable", - .def_scope = SCOPE_LOCAL_CPU, + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU, .capability = ARM64_SSBD, .matches = has_ssbd_mitigation, }, --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -924,7 +924,7 @@ static const struct arm64_cpu_capabiliti { .desc = "GIC system register CPU interface", .capability = ARM64_HAS_SYSREG_GIC_CPUIF, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = has_useable_gicv3_cpuif, .sys_reg = SYS_ID_AA64PFR0_EL1, .field_pos = ID_AA64PFR0_GIC_SHIFT, @@ -935,7 +935,7 @@ static const struct arm64_cpu_capabiliti { .desc = "Privileged Access Never", .capability = ARM64_HAS_PAN, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64MMFR1_EL1, .field_pos = ID_AA64MMFR1_PAN_SHIFT, @@ -948,7 +948,7 @@ static const struct arm64_cpu_capabiliti { .desc = "LSE atomic instructions", .capability = ARM64_HAS_LSE_ATOMICS, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64ISAR0_EL1, .field_pos = ID_AA64ISAR0_ATOMICS_SHIFT, @@ -959,14 +959,14 @@ static const struct arm64_cpu_capabiliti { .desc = "Software prefetching using PRFM", .capability = ARM64_HAS_NO_HW_PREFETCH, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = has_no_hw_prefetch, }, #ifdef CONFIG_ARM64_UAO { .desc = "User Access Override", .capability = ARM64_HAS_UAO, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64MMFR2_EL1, .field_pos = ID_AA64MMFR2_UAO_SHIFT, @@ -980,21 +980,21 @@ static const struct arm64_cpu_capabiliti #ifdef CONFIG_ARM64_PAN { .capability = ARM64_ALT_PAN_NOT_UAO, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = cpufeature_pan_not_uao, }, #endif /* CONFIG_ARM64_PAN */ { .desc = "Virtualization Host Extensions", .capability = ARM64_HAS_VIRT_HOST_EXTN, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = runs_at_el2, .cpu_enable = cpu_copy_el2regs, }, { .desc = "32-bit EL0 Support", .capability = ARM64_HAS_32BIT_EL0, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64PFR0_EL1, .sign = FTR_UNSIGNED, @@ -1004,14 +1004,14 @@ static const struct arm64_cpu_capabiliti { .desc = "Reduced HYP mapping offset", .capability = ARM64_HYP_OFFSET_LOW, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = hyp_offset_low, }, #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 { .desc = "Kernel page table isolation (KPTI)", .capability = ARM64_UNMAP_KERNEL_AT_EL0, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = unmap_kernel_at_el0, .cpu_enable = kpti_install_ng_mappings, }, @@ -1019,7 +1019,7 @@ static const struct arm64_cpu_capabiliti { /* FP/SIMD is not implemented */ .capability = ARM64_HAS_NO_FPSIMD, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .min_field_value = 0, .matches = has_no_fpsimd, }, @@ -1027,7 +1027,7 @@ static const struct arm64_cpu_capabiliti { .desc = "Data cache clean to Point of Persistence", .capability = ARM64_HAS_DCPOP, - .def_scope = SCOPE_SYSTEM, + .type = ARM64_CPUCAP_SCOPE_SYSTEM, .matches = has_cpuid_feature, .sys_reg = SYS_ID_AA64ISAR1_EL1, .field_pos = ID_AA64ISAR1_DPB_SHIFT, @@ -1037,16 +1037,16 @@ static const struct arm64_cpu_capabiliti {}, }; -#define HWCAP_CAP(reg, field, s, min_value, type, cap) \ +#define HWCAP_CAP(reg, field, s, min_value, cap_type, cap) \ { \ .desc = #cap, \ - .def_scope = SCOPE_SYSTEM, \ + .type = ARM64_CPUCAP_SCOPE_SYSTEM, \ .matches = has_cpuid_feature, \ .sys_reg = reg, \ .field_pos = field, \ .sign = s, \ .min_field_value = min_value, \ - .hwcap_type = type, \ + .hwcap_type = cap_type, \ .hwcap = cap, \ } @@ -1140,7 +1140,7 @@ static void __init setup_elf_hwcaps(cons /* We support emulation of accesses to CPU ID feature registers */ elf_hwcap |= HWCAP_CPUID; for (; hwcaps->matches; hwcaps++) - if (hwcaps->matches(hwcaps, hwcaps->def_scope)) + if (hwcaps->matches(hwcaps, cpucap_default_scope(hwcaps))) cap_set_elf_hwcap(hwcaps); } @@ -1167,7 +1167,7 @@ static void update_cpu_capabilities(cons const char *info) { for (; caps->matches; caps++) { - if (!caps->matches(caps, caps->def_scope)) + if (!caps->matches(caps, cpucap_default_scope(caps))) continue; if (!cpus_have_cap(caps->capability) && caps->desc)