Hi Bjorn
commit 15a6711649915ca3e9d1086dc88ff4b616b99aac Author: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> Date: Tue Oct 9 17:25:25 2018 -0500 PCI/AER: Enable reporting for ports enumerated after AER driver registration Previously we enabled AER error reporting only for Switch Ports that were enumerated prior to registering the AER service driver. Switch Ports enumerated after AER driver registration were left with error reporting disabled. A common order, which works correctly, is that we enumerate devices before registering portdrv and the AER driver: - Enumerate all the devices at boot-time - Register portdrv and bind it to all Root Ports and Switch Ports, which disables error reporting for these Ports - Register AER service driver and bind it to all Root Ports, which enables error reporting for the Root Ports and any Switch Ports below them But if we enumerate devices *after* registering portdrv and the AER driver, e.g., if a host bridge driver is loaded as a module, error reporting is not enabled correctly: - Register portdrv and AER driver (this happens at boot-time) - Enumerate a Root Port - Bind portdrv to Root Port, disabling its error reporting - Bind AER service driver to Root Port, enabling error reporting for it and its children (none, since we haven't enumerated them yet) - Enumerate Switch Port below the Root Port - Bind portdrv to Switch Port, disabling its error reporting - AER service driver doesn't bind to Switch Ports, so error reporting remains disabled Hot-adding a Switch fails similarly: error reporting is enabled correctly for the Root Port, but when the Switch is enumerated, the AER service driver doesn't claim it, so there's nothing to enable error reporting for the Switch Ports. Change the AER service driver so it binds to *all* PCIe ports, including Switch Upstream and Downstream Ports. For Switch Ports, enable AER error reporting. Link: https://lore.kernel.org/linux-pci/1536085989-2956-1-git-send-email-jonathan.derrick@xxxxxxxxx Based-on-patch-by: Jon Derrick <jonathan.derrick@xxxxxxxxx> Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index 90b53abf621d..fe6c16461367 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -1316,12 +1316,6 @@ static void aer_enable_rootport(struct aer_rpc *rpc) pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, ®32); pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32); - /* - * Enable error reporting for the root port device and downstream port - * devices. - */ - set_downstream_devices_error_reporting(pdev, true); -
Delete the code will also disable error reporting for the root port as the portdrv to Root Port has disabled its error reporting, so need to enable enable error reporting for the root port. +pci_enable_pcie_error_reporting(pdev); The patch looks good to me except this. Thanks Dongdong.
/* Enable Root Port's interrupt in response to error messages */ pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_COMMAND, ®32); reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; @@ -1378,10 +1372,17 @@ static void aer_remove(struct pcie_device *dev) */ static int aer_probe(struct pcie_device *dev) { + struct pci_dev *pdev = dev->port; + int type = pci_pcie_type(pdev); int status; struct aer_rpc *rpc; struct device *device = &dev->device; + if (type == PCI_EXP_TYPE_UPSTREAM || type == PCI_EXP_TYPE_DOWNSTREAM) { + pci_enable_pcie_error_reporting(pdev); + return 0; + } + rpc = devm_kzalloc(device, sizeof(struct aer_rpc), GFP_KERNEL); if (!rpc) { dev_printk(KERN_DEBUG, device, "alloc AER rpc failed\n"); @@ -1439,7 +1440,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev) static struct pcie_port_service_driver aerdriver = { .name = "aer", - .port_type = PCI_EXP_TYPE_ROOT_PORT, + .port_type = PCIE_ANY_PORT, .service = PCIE_PORT_SERVICE_AER, .probe = aer_probe, .