Terry Bowman wrote: > pci_driver::cxl_err_handlers are not currently assigned handler callbacks. > The handlers can't be set in the pci_driver static definition because the > CXL PCIe Port devices are bound to the portdrv driver which is not CXL > driver aware. > > Add cxl_assign_port_error_handlers() in the cxl_core module. This > function will assign the default handlers for a CXL PCIe Port device. > > When the CXL Port (cxl_port or cxl_dport) is destroyed the device's > pci_driver::cxl_err_handlers must be set to NULL indicating they should no > longer be used. > > Create cxl_clear_port_error_handlers() and register it to be called > when the CXL Port device (cxl_port or cxl_dport) is destroyed. This is another complication that naturally goes away with cxl_error_handlers are instances that get attached to 'struct cxl_driver' instances rather tha 'struct pci_driver' instances. > > Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> > Reviewed-by: Ira Weiny <ira.weiny@xxxxxxxxx> > Reviewed-by: Gregory Price <gourry@xxxxxxxxxx> > --- > drivers/cxl/core/pci.c | 59 ++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 57 insertions(+), 2 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index f154dcf6dfda..03ae21a944e0 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -860,8 +860,39 @@ static pci_ers_result_t cxl_port_error_detected(struct pci_dev *pdev) > return __cxl_handle_ras(dev, &pdev->dev, ras_base); > } > > +static const struct cxl_error_handlers cxl_port_error_handlers = { > + .error_detected = cxl_port_error_detected, > + .cor_error_detected = cxl_port_cor_error_detected, > +}; > + > +static void cxl_assign_port_error_handlers(struct pci_dev *pdev) > +{ > + struct pci_driver *pdrv; > + > + if (!pdev || !pdev->driver || !get_device(&pdev->dev)) > + return; > + > + pdrv = pdev->driver; > + pdrv->cxl_err_handler = &cxl_port_error_handlers; Nothing is holding the @pdev device_lock(), so @pdev->driver may go NULL immediately after reading it. Also, it is possible for a 'struct cxl_port' to exist even though its uport_dev (pci_dev) is not attached to a driver. This would seem to result in unpredictable behavior from one kernel to the next as the PCIe portdrv situation evolves. Lastly, I do not like the precedent of not being able to read a 'struct pci_driver' template and be assured that it captures all possible error handlers, or even worse, this unceremoniously overrides a PCI driver that thinks it knows what the CXL error handlers should be.