On Tue, Oct 29, 2019 at 03:54:56PM -0500, Bjorn Helgaas wrote: > On Fri, Oct 04, 2019 at 03:39:47PM +0300, Mika Westerberg wrote: > > Currently Linux does not follow PCIe spec regarding the required delays > > after reset. A concrete example is a Thunderbolt add-in-card that > > consists of a PCIe switch and two PCIe endpoints: > > ... > > > @@ -1025,15 +1025,11 @@ static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state) > > if (state == PCI_D0) { > > pci_platform_power_transition(dev, PCI_D0); > > /* > > - * Mandatory power management transition delays, see > > - * PCI Express Base Specification Revision 2.0 Section > > - * 6.6.1: Conventional Reset. Do not delay for > > - * devices powered on/off by corresponding bridge, > > - * because have already delayed for the bridge. > > + * Mandatory power management transition delays are handled > > + * in pci_pm_runtime_resume() of the corresponding > > + * downstream/root port. > > */ > > if (dev->runtime_d3cold) { > > - if (dev->d3cold_delay && !dev->imm_ready) > > - msleep(dev->d3cold_delay); > > This removes the only use of d3cold_delay. I assume that's > intentional? If we no longer need it, we might as well remove it from > the pci_dev and remove the places that set it. It'd be nice if that > could be a separate patch, even if we waited a little longer than > necessary at that one bisection point. Yes, it is intentional. In the previous version I had function pcie_get_downstream_delay() that used both d3cold_delay and imm_ready to calculate the downstream device delay but you said: I'm not sold on the idea that this delay depends on what's *below* the bridge. We're using sec 6.6.1 to justify the delay, and that section doesn't say anything about downstream devices. So I dropped it and use 100 ms instead. Now that you mention, I think if we want to continue support that _DSM, we should still take d3cold_delay into account in this patch. There is also one driver (drivers/mfd/intel-lpss-pci.c) that sets it to 0. > It also removes one of the three uses of imm_ready, leaving only the > two in FLR. I suspect there are other places we should use imm_ready, > e.g., transitions to/from D1 and D2, but that would be beyond the > scope of this patch. Right, I think imm_ready does not apply here. If I understand correctly it is exactly for D1, D2 and D3hot transitions which we should take into account in pci_dev_d3_sleep() (which we don't do right now). > > + /* > > + * For PCIe downstream and root ports that do not support speeds > > + * greater than 5 GT/s need to wait minimum 100 ms. For higher > > + * speeds (gen3) we need to wait first for the data link layer to > > + * become active. > > + * > > + * However, 100 ms is the minimum and the PCIe spec says the > > + * software must allow at least 1s before it can determine that the > > + * device that did not respond is a broken device. There is > > + * evidence that 100 ms is not always enough, for example certain > > + * Titan Ridge xHCI controller does not always respond to > > + * configuration requests if we only wait for 100 ms (see > > + * https://bugzilla.kernel.org/show_bug.cgi?id=203885). > > + * > > + * Therefore we wait for 100 ms and check for the device presence. > > + * If it is still not present give it an additional 100 ms. > > + */ > > + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT && > > + pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM) > > + return; > > Shouldn't this be: > > if (!pcie_downstream_port(dev)) > return > > so we include PCI/PCI-X to PCIe bridges? Yes, I'll change it in v3.