Re: [Question] Unsupported Request During PCI Enumeration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 01, 2017 at 05:30:48PM +0800, Dongdong Liu wrote:
> 
> 在 2017/12/1 7:15, Bjorn Helgaas 写道:
> >On Thu, Nov 30, 2017 at 08:47:38PM +0800, Dongdong Liu wrote:
> >>
> >>>I do not think drivers should be calling
> >>>pci_enable_pcie_error_reporting() directly.  And I don't think we
> >>>should depend on _HPX.  _HPX is intended for *platform-specific*
> >>>things, and I don't think this is platform-specific.  And of course,
> >>>_HPX only works on ACPI systems, and we want this to work everywhere.
> >>>
> >>>So I think we need a more comprehensive generic solution in Linux.  I
> >>>think we need several pieces:
> >>>
> >>> - Disable UR reporting during enumeration, similar to how we disable
> >>>   Master Abort reporting.
> >>>
> >>> - Restore original UR reporting setting after enumeration.
> >>>
> >>> - Clear any logged UR errors after enumeration.
> >>
> >>Agree.
> >>
> >>>
> >>> - Enable UR reporting as part of configuring every device, e.g., in
> >>>   pci_configure_device() or pci_init_capabilities().
> >>
> >>Enable UR reporting in pci_configure_device() or pci_init_capabilities() is
> >>not ok. It will still report UR as it is still in enumeration.
> >
> >True.  For bridges, it needs to be after we've enumerated children of
> >the bridge.
> >
> >>>I don't know whether we should enable UR reporting if AER is not
> >>>present; we'll have to think about that.  We may also have to
> >>>restructure how the AER driver works, because I think it currently
> >>>comes into play *after* we call pci_init_capabilities().  There are
> >>>probably some ordering issues there.
> >>
> >>I write a sample code as below, it can avoid report UR during
> >>enumeration. What do you think about the code ?
> >
> >Can you do it the same place we disable and restore Master-Abort Mode,
> >in pci_scan_bridge_extend()?  That's done for the same reason, so it
> >would be nice to keep it together.
> >
> >Maybe we could have a "pci_disable_enumeration_errors()" that would
> >disable both Master-Abort Mode and UR reporting, and a corresponding
> >"pci_restore_enumeration_errors()" that would clear any logged errors
> >and restore the original settings?
> 
> Set devctl to disable UR in pci_scan_bridge_extend() can resolve
> bridge devices UR reporting issue. But for EP devices may still
> reprot UR, like non-ARI EP devices. So it's better to disable UR
> reporting in pci_device_add() after calling
> pci_configure_device(dev) and restore in pci_bus_add_device().  As
> in pci_configure_device() may parse _HPX to enable UR, disable UR
> reporting should call after pci_configure_device().

Help me understand this issue with Endpoints and how ARI is involved.

Is this an issue with multi-function devices, where function 0 exists
and then we try to read the Vendor ID of function 1, 2, ..., until one
fails to respond?  And when one fails to respond, the Endpoint logs a
UR error?

And I guess the ARI connection is that with ARI, we use the ARI
Capability to find the next function, so we should never try to read
the Vendor ID of a non-existent function?

pci_device_add() and pci_bus_add_device() don't have any obvious
connection to enumeration, so I don't really like those places.  If
I'm understanding the issue correctly, I'd rather do the error
disable/restore in pci_scan_slot(), where it's right next to the
enumeration of additional functions.

> >>diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> >>index bc56cf1..1a46026 100644
> >>--- a/drivers/pci/bus.c
> >>+++ b/drivers/pci/bus.c
> >>@@ -14,6 +14,7 @@
> >> #include <linux/ioport.h>
> >> #include <linux/proc_fs.h>
> >> #include <linux/slab.h>
> >>+#include <linux/aer.h>
> >>
> >> #include "pci.h"
> >>
> >>@@ -311,6 +312,16 @@ void __weak pcibios_bus_add_device(struct pci_dev *pdev) { }
> >> void pci_bus_add_device(struct pci_dev *dev)
> >> {
> >>        int retval;
> >>+       u16 reg16;
> >>+
> >>+       /* Clear PCIe Capability's Device Status */
> >>+       pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
> >>+       pcie_capability_write_word(dev, PCI_EXP_DEVSTA, reg16);
> >>+       pci_cleanup_aer_error_status_regs(dev);
> >>+
> >>+       /* Restore UR reporting */
> >>+       if (dev->devctl_ur)
> >>+               pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_URRE);
> >>
> >>        /*
> >>         * Can not put in pci_device_add yet because resources
> >>diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> >>index 14e0ea1..7b34083 100644
> >>--- a/drivers/pci/probe.c
> >>+++ b/drivers/pci/probe.c
> >>@@ -2119,9 +2119,21 @@ static void pci_set_msi_domain(struct pci_dev *dev)
> >> void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
> >> {
> >>        int ret;
> >>+       u16 reg16;
> >>
> >>        pci_configure_device(dev);
> >>
> >>+       pcie_capability_read_word(dev, PCI_EXP_DEVCTL, &reg16);
> >>+       dev->devctl_ur = reg16 & PCI_EXP_DEVCTL_URRE;
> >>+       if (dev->devctl_ur) {
> >>+               /*
> >>+                * Turn off UR reporting, later restore UR reporting
> >>+                * after enumeration.
> >>+                */
> >>+               reg16 = reg16 & (~PCI_EXP_DEVCTL_URRE);
> >>+               pcie_capability_write_word(dev, PCI_EXP_DEVCTL, reg16);
> >>+       }
> >>+
> >>        device_initialize(&dev->dev);
> >>        dev->dev.release = pci_release_dev;
> >>
> >>diff --git a/include/linux/pci.h b/include/linux/pci.h
> >>index ae8d2b4..c2a90f1 100644
> >>--- a/include/linux/pci.h
> >>+++ b/include/linux/pci.h
> >>@@ -440,6 +440,7 @@ struct pci_dev {
> >>        char *driver_override; /* Driver name to force a match */
> >>
> >>        unsigned long priv_flags; /* Private flags for the pci driver */
> >>+       u16 devctl_ur;
> >> };



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux