Tested-by: Duc Dang <ducdang@xxxxxxxxxx> On Tue, Aug 27, 2024 at 4:57 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > After a device reset, delays are required before the device can > successfully complete config accesses. PCIe r6.0, sec 6.6, specifies some > delays required before software can perform config accesses. Devices that > require more time after those delays may respond to config accesses with > Configuration Request Retry Status (RRS) completions. > > Callers of pci_dev_wait() are responsible for delays until the device can > respond to config accesses. pci_dev_wait() waits any additional time until > the device can successfully complete config accesses. > > Reading config space of devices that are not present or not ready typically > returns ~0 (PCI_ERROR_RESPONSE). Previously we polled the Command register > until we got a value other than ~0. This is sometimes a problem because > Root Complex handling of RRS completions may include several retries and > implementation-specific behavior that is invisible to software (see sec > 2.3.2), so the exponential backoff in pci_dev_wait() may not work as > intended. > > Linux enables Configuration RRS Software Visibility on all Root Ports that > support it. If it is enabled, read the Vendor ID instead of the Command > register. RRS completions cause immediate return of the 0x0001 reserved > Vendor ID value, so the pci_dev_wait() backoff works correctly. > > When a read of Vendor ID eventually completes successfully by returning a > non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be > initialized and ready to respond to config requests. > > For conventional PCI devices or devices below Root Ports that don't support > Configuration RRS Software Visibility, poll the Command register as before. > > Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > --- > drivers/pci/pci.c | 41 ++++++++++++++++++++++++++++------------- > drivers/pci/pci.h | 5 +++++ > drivers/pci/probe.c | 9 +++------ > include/linux/pci.h | 1 + > 4 files changed, 37 insertions(+), 19 deletions(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index e3a49f66982d..fc2ecb7fe081 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -1283,7 +1283,9 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) > { > int delay = 1; > bool retrain = false; > - struct pci_dev *bridge; > + struct pci_dev *root, *bridge; > + > + root = pcie_find_root_port(dev); > > if (pci_is_pcie(dev)) { > bridge = pci_upstream_bridge(dev); > @@ -1292,16 +1294,23 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) > } > > /* > - * After reset, the device should not silently discard config > - * requests, but it may still indicate that it needs more time by > - * responding to them with CRS completions. The Root Port will > - * generally synthesize ~0 (PCI_ERROR_RESPONSE) data to complete > - * the read (except when CRS SV is enabled and the read was for the > - * Vendor ID; in that case it synthesizes 0x0001 data). > + * The caller has already waited long enough after a reset that the > + * device should respond to config requests, but it may respond > + * with Request Retry Status (RRS) if it needs more time to > + * initialize. > * > - * Wait for the device to return a non-CRS completion. Read the > - * Command register instead of Vendor ID so we don't have to > - * contend with the CRS SV value. > + * If the device is below a Root Port with Configuration RRS > + * Software Visibility enabled, reading the Vendor ID returns a > + * special data value if the device responded with RRS. Read the > + * Vendor ID until we get non-RRS status. > + * > + * If there's no Root Port or Configuration RRS Software Visibility > + * is not enabled, the device may still respond with RRS, but > + * hardware may retry the config request. If no retries receive > + * Successful Completion, hardware generally synthesizes ~0 > + * (PCI_ERROR_RESPONSE) data to complete the read. Reading Vendor > + * ID for VFs and non-existent devices also returns ~0, so read the > + * Command register until it returns something other than ~0. > */ > for (;;) { > u32 id; > @@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) > return -ENOTTY; > } > > - pci_read_config_dword(dev, PCI_COMMAND, &id); > - if (!PCI_POSSIBLE_ERROR(id)) > - break; > + if (root && root->config_crs_sv) { > + pci_read_config_dword(dev, PCI_VENDOR_ID, &id); > + if (!pci_bus_crs_vendor_id(id)) > + break; > + } else { > + pci_read_config_dword(dev, PCI_COMMAND, &id); > + if (!PCI_POSSIBLE_ERROR(id)) > + break; > + } > > if (delay > timeout) { > pci_warn(dev, "not ready %dms after %s; giving up\n", > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index 79c8398f3938..fa1997bc2667 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -139,6 +139,11 @@ bool pci_bridge_d3_possible(struct pci_dev *dev); > void pci_bridge_d3_update(struct pci_dev *dev); > int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type); > > +static inline bool pci_bus_crs_vendor_id(u32 l) > +{ > + return (l & 0xffff) == PCI_VENDOR_ID_PCI_SIG; > +} > + > static inline void pci_wakeup_event(struct pci_dev *dev) > { > /* Wait 100 ms before the system can be put into a sleep state. */ > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index b14b9876c030..b1615da9eb6b 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -1209,9 +1209,11 @@ static void pci_enable_crs(struct pci_dev *pdev) > > /* Enable CRS Software Visibility if supported */ > pcie_capability_read_word(pdev, PCI_EXP_RTCAP, &root_cap); > - if (root_cap & PCI_EXP_RTCAP_CRSVIS) > + if (root_cap & PCI_EXP_RTCAP_CRSVIS) { > pcie_capability_set_word(pdev, PCI_EXP_RTCTL, > PCI_EXP_RTCTL_CRSSVE); > + pdev->config_crs_sv = 1; > + } > } > > static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus, > @@ -2343,11 +2345,6 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus) > } > EXPORT_SYMBOL(pci_alloc_dev); > > -static bool pci_bus_crs_vendor_id(u32 l) > -{ > - return (l & 0xffff) == PCI_VENDOR_ID_PCI_SIG; > -} > - > static bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, > int timeout) > { > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 4cf89a4b4cbc..121d8d94d6d0 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -371,6 +371,7 @@ struct pci_dev { > can be generated */ > unsigned int pme_poll:1; /* Poll device's PME status bit */ > unsigned int pinned:1; /* Whether this dev is pinned */ > + unsigned int config_crs_sv:1; /* Config CRS software visibility */ > unsigned int imm_ready:1; /* Supports Immediate Readiness */ > unsigned int d1_support:1; /* Low power state D1 is supported */ > unsigned int d2_support:1; /* Low power state D2 is supported */ > -- > 2.34.1 >