On Thu, 6 Jun 2019 20:36:48 +0800
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:

Hi All,

I'm looking for some reviews on this series if anyone has time to take
a look.

Rasdaemon patches to match with this are on linux-edac but are waiting
on the tracepoints merging.

I'm not currently planning to upstream the qemu injection patches used
to test this, but if anyone would like those I can certainly put a
public branch up somewhere.

Thanks,

Jonathan

> UEFI 2.8 defines a new CPER record Appendix N for CCIX Protocol Error Records
> (PER). www.uefi.org
>
> These include Protocol Error Record logs which are defined in the
> CCIX 1.0 Base Specification www.ccixconsortium.com.
>
> Handling of coherency protocol errors is complex and how Linux does this
> will take some time to evolve. For now, fatal errors are handled via the
> usual means and everything else is reported.
>
> There are 6 types of error defined, covering:
> * Memory errors
> * Cache errors
> * Address translation unit errors
> * CCIX port errors
> * CCIX link errors
> * Agent internal errors.
>
> The set includes tracepoints to report the errors to RAS Daemon and a patch
> set for RAS Daemon will follow shortly.
>
> There are several open questions for this RFC.
> 1. Reporting of vendor data. We have little choice but to do this via a
>    dynamic array as these blocks can take arbitrary size. I had hoped
>    no one would actually use these given the odd mismatch between a
>    standard error structure and non standard element, but there are
>    already designs out there that do use it.
> 2. The trade off between explicit tracepoint fields, on which we might
>    want to filter, and the simplicity of a blob. I have gone for having
>    the whole of the block specific to the PER error type in an opaque blob.
>    Perhaps this is not the right balance?
> 3. Whether defining 6 new tracepoints is sensible. I think it is:
>    * They are all defined by the CCIX specification as independent error
>      classes.
>    * Many of them can only be generated by particular types of agent.
>    * The handling required will vary widely depending on types.
>      In the kernel some map cleanly onto existing handling. Keeping the
>      whole flow separate will aid this. They vary by a similar amount
>      in scope to the RAS errors found on an existing system which have
>      independent tracepoints.
>    * Separating them out allows for filtering on the tracepoints by
>      elements that are not shared between them.
>    * Muxing the lot into one record type can lead to ugly code both in
>      kernel and in userspace.
>
> Rasdaemon patches will follow shortly.
>
> This patch is being distributed by the CCIX Consortium, Inc. (CCIX) to
> you and other parties that are participating (the "participants") in the
> Linux kernel with the understanding that the participants will use CCIX's
> name and trademark only when this patch is used in association with the
> Linux kernel and associated user space.
>
> CCIX is also distributing this patch to these participants with the
> understanding that if any portion of the CCIX specification will be
> used or referenced in the Linux kernel, the participants will not modify
> the cited portion of the CCIX specification and will give CCIX proper
> copyright attribution by including the following copyright notice with
> the cited part of the CCIX specification:
> "© 2019 CCIX CONSORTIUM, INC. ALL RIGHTS RESERVED."
>
> Jonathan Cameron (6):
>   efi / ras: CCIX Memory error reporting
>   efi / ras: CCIX Cache error reporting
>   efi / ras: CCIX Address Translation Cache error reporting
>   efi / ras: CCIX Port error reporting
>   efi / ras: CCIX Link error reporting
>   efi / ras: CCIX Agent internal error reporting
>
>  drivers/acpi/apei/Kconfig        |   8 +
>  drivers/acpi/apei/ghes.c         |  59 ++
>  drivers/firmware/efi/Kconfig     |   5 +
>  drivers/firmware/efi/Makefile    |   1 +
>  drivers/firmware/efi/cper-ccix.c | 916 +++++++++++++++++++++++++++++++
>  drivers/firmware/efi/cper.c      |   6 +
>  include/linux/cper.h             | 333 +++++++++++
>  include/ras/ras_event.h          | 405 ++++++++++++++
>  8 files changed, 1733 insertions(+)
>  create mode 100644 drivers/firmware/efi/cper-ccix.c
>
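
For anyone weighing open questions 1 and 2 in the quoted cover letter,
the layout being discussed can be sketched in plain user-space C: a few
explicit common fields, the type-specific block kept as an opaque blob,
and vendor data carried as a variable-length tail. This is only an
illustration under stated assumptions, not code from the series. The
names per_type, per_event, per_type_name, per_event_alloc and
per_event_vendor are hypothetical, and the enum values are placeholders
rather than type codes taken from the CCIX specification.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical labels for the six error classes listed in the cover
 * letter. The numeric values are placeholders, not CCIX type codes.
 */
enum per_type {
	PER_MEMORY,
	PER_CACHE,
	PER_ATC,
	PER_PORT,
	PER_LINK,
	PER_AGENT_INTERNAL,
	PER_TYPE_COUNT
};

/*
 * One reported event. The explicit fields are the ones a consumer
 * could filter on; everything type specific stays opaque. Both
 * variable-length regions live back to back in the flexible array.
 */
struct per_event {
	enum per_type type;
	uint32_t blob_len;	/* length of the type-specific block */
	uint32_t vendor_len;	/* length of the vendor data, may be 0 */
	uint8_t data[];		/* blob bytes, then vendor bytes */
};

/* Map an error class to a short printable name. */
const char *per_type_name(enum per_type type)
{
	switch (type) {
	case PER_MEMORY:		return "memory";
	case PER_CACHE:			return "cache";
	case PER_ATC:			return "atc";
	case PER_PORT:			return "port";
	case PER_LINK:			return "link";
	case PER_AGENT_INTERNAL:	return "agent-internal";
	default:			return "unknown";
	}
}

/*
 * Allocate an event sized for this particular record and copy both
 * variable-length regions into it. Returns NULL on allocation failure;
 * the caller releases the event with free().
 */
struct per_event *per_event_alloc(enum per_type type,
				  const void *blob, uint32_t blob_len,
				  const void *vendor, uint32_t vendor_len)
{
	struct per_event *ev;

	assert(type < PER_TYPE_COUNT);

	ev = malloc(sizeof(*ev) + (size_t)blob_len + vendor_len);
	if (!ev)
		return NULL;

	ev->type = type;
	ev->blob_len = blob_len;
	ev->vendor_len = vendor_len;
	if (blob_len)
		memcpy(ev->data, blob, blob_len);
	if (vendor_len)
		memcpy(ev->data + blob_len, vendor, vendor_len);
	return ev;
}

/* Start of the vendor data, or NULL if the record carries none. */
const uint8_t *per_event_vendor(const struct per_event *ev)
{
	return ev->vendor_len ? ev->data + ev->blob_len : NULL;
}
```

The point of the sketch is the trade-off itself: a consumer can match on
type and the two lengths without parsing anything, but any field inside
the blob (an address, a cache level) is invisible to filtering until it
is decoded. Promoting such a field to an explicit member is the
alternative that question 2 asks about.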