Hi Sai,

(CC: +Tyler)

On 05/12/2019 09:53, Sai Prakash Ranjan wrote:
> This adds DT bindings for Kryo EDAC implemented with RAS
> extensions on KRYO{3,4}XX CPU cores for reporting of cache
> errors.

KRYO{3,4}XX isn't the only SoC with the RAS extensions. The DT needs to
convey the range of ways this armv8 RAS extensions stuff can be wired up.

The folk who look after the ACPI specs have made a start:
https://static.docs.arm.com/den0085/a/DEN0085_RAS_ACPI_1.0_BETA_1.pdf

(I suspect that isn't the latest version, I'll try and find out)

I'd like the ACPI table and DT to convey the same information so that we
don't need to convert or infer things in the driver. If something is
missing, we should get it added!


> diff --git a/Documentation/devicetree/bindings/edac/qcom-kryo-edac.yaml b/Documentation/devicetree/bindings/edac/qcom-kryo-edac.yaml
> new file mode 100644
> index 000000000000..1a39429a73b4
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/edac/qcom-kryo-edac.yaml
> @@ -0,0 +1,67 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/edac/qcom-kryo-edac.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Kryo Error Detection and Correction(EDAC)
> +
> +maintainers:
> +  - Sai Prakash Ranjan <saiprakash.ranjan@xxxxxxxxxxxxxx>
> +
> +description: |
> +  Kryo EDAC is defined to describe on-chip error detection and correction
> +  for the Kryo CPU cores which implement RAS extensions.

Please don't make this Kryo specific, otherwise this binding becomes an
extra thing we need to support with a 'v8.2 RAS' driver.

What I'd like is a single 'armv82_ras' edac driver that handles faults and
errors reported by interrupts, and interacts with the arch code's handling
of 'external aborts'. This should work for all platforms using v8.2 RAS
and later.

> +  It will report
> +  all Single Bit Errors and Double Bit Errors found in L1/L2 caches in
> +  in two registers ERXSTATUS_EL1 and ERXMISC0_EL1. L3-SCU cache errors
> +  are reported in ERR1STATUS and ERR1MISC0 registers.
> +  ERXSTATUS_EL1 - Selected Error Record Primary Status Register, EL1
> +  ERXMISC0_EL1 - Selected Error Record Miscellaneous Register 0, EL1
> +  ERR1STATUS - Error Record Primary Status Register
> +  ERR1MISC0 - Error Record Miscellaneous Register 0
> +  Current implementation of Kryo ECC(Error Correcting Code) mechanism is
> +  based on interrupts.

Your SoC picked the system registers as the interface to these
components' registers. The binding would need to specify which index the
'l1-l2' records start at, and how many there are. The same for the
'l3-scu'. You can't hard code these, they are different on other
platforms.

There is also an MMIO interface which needs a base address, along with the
index and ranges (which may be different). The same component may use both
the system register and the MMIO interface.

This stuff is likely to vary on big/little systems, so you need a way of
describing which CPUs the settings refer to. This probably isn't something
the ACPI tables capture, as ACPI machines are typically homogeneous.


Thanks,

James