Re: [RFC PATCH] PCI/portdrv: Fix switch devctrl error report enable.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 13, 2017 at 05:02:00PM +0800, Dongdong Liu wrote:
> Current code has a bug, switch upstream/downstream port error report
> is disabled.
> 
> DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> -[0000:00]-+-00.0-[01]--
>            \-08.0-[02-07]----00.0-[03-07]--+-04.0-[04]--
>                                            +-05.0-[05]--
>                                            +-08.0-[06]--
>                                            \-09.0-[07]--
> portdrv_pci.c
> pcie_portdrv_probe
> 	--->pcie_port_device_register
> 		--->get_port_device_capability
> 			--->pci_disable_pcie_error_reporting
> 
> aerdrv.c
> aer_probe
> 	--->aer_enable_rootport
> 		--->set_downstream_devices_error_reporting
> 
> But current code RP AER driver probe is before switch UP/DP portdrv probe.
> This is a RFC PATCH for discussing.

OK, I think I see the issue you're pointing out.  aer_probe() only
binds to root ports (".port_type = PCI_EXP_TYPE_ROOT_PORT" in the
aerdriver struct).

We call aer_probe() for a root port, and it enables AER reporting for
the root port and any downstream devices:

  aer_probe(root port)                        # only binds to root ports
    aer_enable_rootport
      set_downstream_devices_error_reporting(root, true)
        set_device_error_reporting(root, true)
          pci_enable_pcie_error_reporting
            pcie_capability_set_word(root, PCI_EXP_DEVCTL, PCI_EXP_AER_FLAGS)
        pci_walk_bus(root->subordinate set_device_error_reporting, true)
          set_device_error_reporting(dev, true)
            pci_enable_pcie_error_reporting
              pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_AER_FLAGS)

We later call pcie_portdrv_probe() for every downstream bridge (it
matches PCI_CLASS_BRIDGE_PCI devices, then discards any non-PCIe
devices), and it *disables* AER reporting:

  pcie_portdrv_probe(switch port)
    pcie_port_device_register
      get_port_device_capability
        pci_disable_pcie_error_reporting
          pcie_capability_clear_word(dev, PCI_EXP_DEVCTL, PCI_EXP_AER_FLAGS)

The result is that we first enable AER for the downstream switch
ports, then we disable it again.

I agree, that looks like a problem.  I think aer_enable_rootport() is
broken because it binds to one device and it also configures other
downstream devices.  That opens the door to concurrency issues and
makes it really hard to ensure that hotplug works correctly.

I think the aer_probe() path should only touch the device it is
binding, i.e., it should not use pci_walk_bus().  If we need to
configure another device, that should be done in the enumeration path
for *that device*.

> Signed-off-by: Dongdong Liu <liudongdong3@xxxxxxxxxx>
> ---
>  drivers/pci/pcie/portdrv_core.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> index 313a21d..6bb89ce 100644
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -261,7 +261,9 @@ static int get_port_device_capability(struct pci_dev *dev)
>  		 * Disable AER on this port in case it's been enabled by the
>  		 * BIOS (the AER service driver will enable it when necessary).
>  		 */
> -		pci_disable_pcie_error_reporting(dev);
> +		if((pci_pcie_type(dev) != PCI_EXP_TYPE_UPSTREAM) &&
> +		   (pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM))
> +			pci_disable_pcie_error_reporting(dev);
>  	}
>  	/* VC support */
>  	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_VC))
> -- 
> 1.9.1
> 



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux