On 11/18/2024 4:37 AM, Lukas Wunner wrote:
> On Wed, Nov 13, 2024 at 03:54:21PM -0600, Terry Bowman wrote:
>> Non-fatal CXL UCE errors will be treated as fatal.
> Hm, I wonder why?
>
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -1048,7 +1048,10 @@ static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info)
>>  			pdrv->cxl_err_handler->cor_error_detected(dev);
>>
>>  		pcie_clear_device_status(dev);
>> -	}
>> +	} else if (info->severity == AER_NONFATAL)
>> +		cxl_do_recovery(dev);
>> +	else if (info->severity == AER_FATAL)
>> +		cxl_do_recovery(dev);
>>  }
> Nit: Maybe use curly braces and collapse both if-block into one.

I'll make the change.

>> +	cxl_walk_bridge(bridge, cxl_report_error_detected, &status);
>> +	if (status)
>> +		panic("CXL cachemem error. Invoking panic");
> Nit: This will be prefixed by "Kernel panic - not syncing: ",
> so another "Invoking panic" message seems somewhat redundant.
>
> Thanks,
>
> Lukas

Ok, good idea.

Regards,
Terry
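
For reference, the collapsed branch suggested above could look roughly like the sketch below. This is a standalone userspace illustration, not the actual follow-up patch: the enum, the counters, and the `*_stub` / `handle_error_sketch` names are hypothetical stand-ins for `info->severity`, `cor_error_detected()` and `cxl_do_recovery()`, so the shape of the if/else chain can be compiled and checked outside the kernel.

```c
/* Hypothetical stand-ins for the AER severity values (assumption:
 * only the three names matter here, not their numeric values). */
enum aer_severity_sketch { AER_CORRECTABLE, AER_NONFATAL, AER_FATAL };

/* Counters so the sketch can be checked without a kernel. */
static int recovery_calls;
static int cor_calls;

/* Stub standing in for cxl_do_recovery(dev). */
static void cxl_do_recovery_stub(void)
{
	recovery_calls++;
}

/* Stub standing in for the driver's cor_error_detected(dev) callback. */
static void cor_error_detected_stub(void)
{
	cor_calls++;
}

/* Same control flow as the quoted hunk, with braces on every branch
 * and the two uncorrectable cases merged into a single condition. */
static void handle_error_sketch(enum aer_severity_sketch severity)
{
	if (severity == AER_CORRECTABLE) {
		cor_error_detected_stub();
	} else if (severity == AER_NONFATAL || severity == AER_FATAL) {
		cxl_do_recovery_stub();
	}
}
```

Both uncorrectable severities take the same recovery path, which matches the stated intent that non-fatal CXL UCE errors are treated as fatal.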