Terry Bowman wrote:
> The AER service driver supports handling Downstream Port Protocol Errors in
> Restricted CXL host (RCH) mode also known as CXL1.1. It needs the same
> functionality for CXL PCIe Ports operating in Virtual Hierarchy (VH)
> mode.[1]
>
> CXL and PCIe Protocol Error handling have different requirements that
> necessitate a separate handling path. The AER service driver may try to
> recover PCIe uncorrectable non-fatal errors (UCE). The same recovery is not
> suitable for CXL PCIe Port devices because of potential for system memory
> corruption. Instead, CXL Protocol Error handling must use a kernel panic
> in the case of a fatal or non-fatal UCE. The AER driver's PCIe Protocol
> Error handling does not panic the kernel in response to a UCE.
>
> Introduce a separate path for CXL Protocol Error handling in the AER
> service driver. This will allow CXL Protocol Errors to use CXL specific
> handling instead of PCIe handling. Add the CXL specific changes without
> affecting or adding functionality in the PCIe handling.
>
> Make this update alongside the existing Downstream Port RCH error handling
> logic, extending support to CXL PCIe Ports in VH mode.
>
> Remove is_internal_error(). is_internal_error() was used to determine if
> an AER error was a CXL error. Instead, now rely on pcie_is_cxl_port() to
> indicate the error is a CXL error.

Wait, pcie_is_cxl_port() in isolation is insufficient, right? In other
words, I would expect that, when the response may escalate to panic(), the
code should be reasonably certain that this *is* a CXL error. At a minimum
that is:

    pcie_is_cxl_port() && is_internal_error()

...or am I missing something that makes it unlikely that a standard PCIe
error, or some other internal error type, will be thrown by a
pcie_is_cxl_port() device?
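
For illustration only, the combined check could look roughly like the
untested user-space sketch below. It is not the kernel code: struct
mock_err_info, enum mock_severity, mock_is_internal_error() and
mock_is_cxl_error() are hypothetical stand-ins, and the is_cxl_port field
merely models whatever pcie_is_cxl_port() would return for the device. The
two status masks are meant to match the AER internal-error bits in
include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* AER status bits, intended to match include/uapi/linux/pci_regs.h */
#define PCI_ERR_UNC_INTN	0x00400000	/* Uncorrectable Internal Error */
#define PCI_ERR_COR_INTERNAL	0x00004000	/* Corrected Internal Error */

/* Hypothetical severity encoding, for this sketch only */
enum mock_severity { MOCK_NONFATAL, MOCK_FATAL, MOCK_CORRECTABLE };

/* Hypothetical stand-in for the AER error info plus the port type */
struct mock_err_info {
	enum mock_severity severity;
	uint32_t status;	/* raw AER (un)correctable status register */
	bool is_cxl_port;	/* models the result of pcie_is_cxl_port() */
};

/* True only if the status register reports an internal error */
static inline bool mock_is_internal_error(const struct mock_err_info *info)
{
	if (info->severity == MOCK_CORRECTABLE)
		return (info->status & PCI_ERR_COR_INTERNAL) != 0;

	return (info->status & PCI_ERR_UNC_INTN) != 0;
}

/*
 * Route to the CXL (panic-capable) path only when both conditions hold:
 * the device is a CXL port *and* the error is signaled as internal.
 */
static inline bool mock_is_cxl_error(const struct mock_err_info *info)
{
	return info->is_cxl_port && mock_is_internal_error(info);
}
```

With a check shaped like this, a CXL port that reports an ordinary PCIe
uncorrectable error (no internal-error bit set) would stay on the existing
PCIe recovery path rather than escalating to panic().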