Hi Dave,

On 11/30/22 18:02, Dave Jiang wrote:
> Add AER error handler callback to read the RAS capability structure
> correctable error (CE) status register for the CXL device. Log the
> error as a trace event and clear the error. For CXL devices, the driver
> also needs to write back to the status register to clear the
> unmasked correctable errors.
>
> See CXL spec rev3.0 8.2.4.16 for RAS capability structure CE Status
> Register.
>
> Suggested-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> ---
>
> v6:
> - Update commit log to point to RAS capability structure. (Bjorn)
> - Change cxl_correctable_error_logging() to cxl_cor_error_detected().
>   (Bjorn)
>
>  drivers/cxl/pci.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 11f842df9807..02342830b612 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -622,10 +622,30 @@ static void cxl_error_resume(struct pci_dev *pdev)
>  		 dev->driver ? "successful" : "failed");
>  }
>
> +static void cxl_cor_error_detected(struct pci_dev *pdev)
> +{
> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> +	struct cxl_memdev *cxlmd = cxlds->cxlmd;
> +	struct device *dev = &cxlmd->dev;
> +	void __iomem *addr;
> +	u32 status;
> +
> +	if (!cxlds->regs.ras)
> +		return;
> +
> +	addr = cxlds->regs.ras + CXL_RAS_CORRECTABLE_STATUS_OFFSET;
> +	status = le32_to_cpu(readl(addr));
> +	if (status & CXL_RAS_CORRECTABLE_STATUS_MASK) {
> +		writel(status & CXL_RAS_CORRECTABLE_STATUS_MASK, addr);
> +		trace_cxl_aer_correctable_error(dev_name(dev), status);
> +	}
> +}
> +

This will log a PCI AER CE only if a RAS CE status bit is also set. My
understanding (which could be where I am going wrong) is that AER CEs
are normally reported. Will this be inconsistent with other AER CE
handling?

Regards,
Terry

>  static const struct pci_error_handlers cxl_error_handlers = {
>  	.error_detected	= cxl_error_detected,
>  	.slot_reset	= cxl_slot_reset,
>  	.resume		= cxl_error_resume,
> +	.cor_error_detected	= cxl_cor_error_detected,
>  };
>
>  static struct pci_driver cxl_pci_driver = {
>
>