[PATCH v5 0/9] efi/cxl-cper: Report CPER CXL component events through trace events

Series status/background
========================

Smita has been a great help with this series.  Thank you again!

Smita's testing found that the GHES code ended up printing the events
twice.  This version avoids the duplicate print by calling the callback
from the GHES code instead of the EFI code, as suggested by Dan.

Dependencies
============

NOTE: This series still depends on Dan's addition of a device guard[1].
Therefore, the base commit is not a stable commit.  I've pushed a branch
with this commit included for testing if folks are interested.[2]

[1] https://lore.kernel.org/all/170250854466.1522182.17555361077409628655.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx/
[2] https://github.com/weiny2/linux-kernel/tree/cxl-cper-2023-12-20

Cover letter
============

CXL Component Events, as defined by UEFI 2.10 Section N.2.14, wrap a
payload that is mostly a CXL event record in an EFI Common Platform
Error Record (CPER).  If a device is configured for firmware first, CXL
event records are not sent directly to the host.

The CXL subsystem is the only place that has device physical address
(DPA) to host physical address (HPA) translation information.  It also
already has event format tracing.  Restructure the code so that the
event data can be shared efficiently between the CPER and event log
paths.  Then send the CXL CPER records to the CXL subsystem for
processing.

With event logs, the events interrupt the driver directly.  In the EFI
case, events are wrapped with device information, which allows the CXL
subsystem to identify the PCI device.

Previous versions considered matching the memdev differently.  However,
the most robust approach was to find the PCI device via Bus, Device,
Function and use the PCI device to find the driver data.
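
As a rough userspace illustration of the Bus, Device, Function
identification described above, the sketch below packs a device and
function number using the same bit layout as the kernel's PCI_DEVFN()
macro (device in bits 7:3, function in bits 2:0).  The struct and the
sketch_* helper names are hypothetical and exist only for this example;
they are not taken from the series.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the device identification that accompanies
 * a CPER record.  Field names are assumptions made for this sketch.
 */
struct sketch_bdf {
	uint8_t bus;
	uint8_t device;		/* 5 bits: 0..31 */
	uint8_t function;	/* 3 bits: 0..7 */
};

/* Same bit layout as PCI_DEVFN(): device in bits 7:3, function in 2:0. */
static inline uint8_t sketch_devfn(uint8_t device, uint8_t function)
{
	return (uint8_t)(((device & 0x1f) << 3) | (function & 0x07));
}

/* Format as the familiar "bb:dd.f" string (domain/segment omitted). */
static inline int sketch_bdf_str(const struct sketch_bdf *bdf,
				 char *buf, size_t len)
{
	return snprintf(buf, len, "%02x:%02x.%x", bdf->bus,
			bdf->device & 0x1f, bdf->function & 0x07);
}
```

In the kernel, a lookup helper such as pci_get_domain_bus_and_slot()
takes a domain, a bus number, and a devfn packed this way.  Whether the
series uses exactly that helper is not stated in this cover letter.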

CPER records are identified with GUIDs while CXL event logs contain
UUIDs.  The UUID is reported for all events no matter the source.
While the UUID is redundant for the known events, the UUIDs are already
used by rasdaemon.  To keep compatibility, UUIDs are still reported.
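
One practical difference between the two identifier types is byte
order: an EFI GUID stores its first three fields little-endian, while a
UUID is big-endian throughout.  The sketch below only illustrates that
layout difference.  The cover letter does not say how the series
relates a CPER GUID to an event UUID, so this is not a description of
its method, and sketch_guid_to_uuid() is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/*
 * Convert 16 bytes in EFI GUID memory layout into UUID (big-endian)
 * byte order.  Only the first three fields differ between the two.
 */
static inline void sketch_guid_to_uuid(const uint8_t guid[16],
				       uint8_t uuid[16])
{
	/* 32-bit field: reverse 4 bytes */
	uuid[0] = guid[3];
	uuid[1] = guid[2];
	uuid[2] = guid[1];
	uuid[3] = guid[0];
	/* first 16-bit field: reverse 2 bytes */
	uuid[4] = guid[5];
	uuid[5] = guid[4];
	/* second 16-bit field: reverse 2 bytes */
	uuid[6] = guid[7];
	uuid[7] = guid[6];
	/* the remaining 8 bytes are stored identically in both */
	memcpy(&uuid[8], &guid[8], 8);
}
```

Because the layouts differ, comparing a raw GUID against a raw UUID
byte for byte would miss a match even when both name the same event.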

In addition, this series cleans up the UUID defines.

Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
---
Changes in v5:
- Smita/djbw: trigger trace from ghes_do_proc()
- Jonathan: split out PCI scope-based functions into their own patch
- Jonathan: remove unneeded static uuid variables
- Smita/djbw: trace an unknown event type as a generic with null UUID
- Jonathan: code clean ups
- Link to v4: https://lore.kernel.org/r/20231215-cxl-cper-v4-0-01b6dab44fcd@xxxxxxxxx

---
Ira Weiny (9):
      cxl/trace: Pass uuid explicitly to event traces
      cxl/events: Promote CXL event structures to a core header
      cxl/events: Create common event UUID defines
      cxl/events: Remove passing a UUID to known event traces
      cxl/events: Separate UUID from event structures
      cxl/events: Create a CXL event union
      acpi/ghes: Process CXL Component Events
      PCI: Define scoped based management functions
      cxl/pci: Register for and process CPER events

 drivers/acpi/apei/ghes.c     |  88 +++++++++++++++++++++++
 drivers/cxl/core/mbox.c      |  87 +++++++++++------------
 drivers/cxl/core/trace.h     |  14 ++--
 drivers/cxl/cxlmem.h         | 110 +++++++----------------------
 drivers/cxl/pci.c            |  58 ++++++++++++++-
 include/linux/cxl-event.h    | 162 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h          |   2 +
 tools/testing/cxl/test/mem.c | 163 ++++++++++++++++++++++++-------------------
 8 files changed, 476 insertions(+), 208 deletions(-)
---
base-commit: 6436863dfabce0d7ac416c8dc661fd513b967d39
change-id: 20230601-cxl-cper-26ffc839c6c6

Best regards,
-- 
Ira Weiny <ira.weiny@xxxxxxxxx>




