Hi, On Sun, Apr 16, 2023 at 09:48:46AM +0200, Lukas Wunner wrote: > On Fri, Apr 14, 2023 at 01:11:47PM +0300, Mika Westerberg wrote: > > To summarize the v4 patch would look something like below. Only compile > > tested but I will run real testing later today. I think it now includes > > the 1s optimization and also checking of the active link reporting > > support for the devices behind slow links. Let me know is I missed > > something. > > The patch seems to be based on a branch which has the v3 patch applied > instead of on pci.git/reset, and that makes it slightly more difficult > to review, but from a first glance it LGTM. Oops :( I forgot to rebase it. Sorry about that. > > > It is getting rather complex unfortunately :( > > I disagree. :) Basically the Gen1/Gen2 situation becomes a special case > because it has specific timing requirements (need to observe a delay > before accessing the Secondary Bus, instead of waiting for the link) > and it doesn't necessarily support link active reporting. So special > casing it seems fair to me. OK. > > - * However, 100 ms is the minimum and the PCIe spec says the > > - * software must allow at least 1s before it can determine that the > > - * device that did not respond is a broken device. There is > > - * evidence that 100 ms is not always enough, for example certain > > - * Titan Ridge xHCI controller does not always respond to > > - * configuration requests if we only wait for 100 ms (see > > - * https://bugzilla.kernel.org/show_bug.cgi?id=203885). > > + * However, 100 ms is the minimum and the PCIe spec says the software > > + * must allow at least 1s before it can determine that the device that > > + * did not respond is a broken device. Also device can take longer than > > + * that to respond if it indicates so through Request Retry Status > > + * completions. > > It *might* be worth avoiding the rewrapping of the first 3 lines to > make the patch smaller, your choice. Makes sense, will do. > > + > > + /* > > + * If the port supports active link reporting we now check one > > + * more time if the link is active and if not bail out early > > + * with the assumption that the device is not present anymore. > > + */ > > Nit: Drop the "one more time" because it seems this is actually the > first time the link is checked. Sure. > Somewhat tangentially, I note that pcie_wait_for_link_delay() has a > "if (!pdev->link_active_reporting)" branch right at its top, however > pci_bridge_wait_for_secondary_bus() only calls the function in the > Gen3+ (> 5 GT/s) case, which always supports link active reporting. > > Thus the branch is never taken when pcie_wait_for_link_delay() is called > from pci_bridge_wait_for_secondary_bus(). There's only one other caller, > pcie_wait_for_link(). So moving the "if (!pdev->link_active_reporting)" > branch to pcie_wait_for_link() *might* make the code more readable. > Just a thought. Indeed, however moving it would make the two functions behave differently where as before pcie_wait_for_link() is just a common wrapper on top fo pcie_wait_for_link_delay(). At least if we do this change, the documentation should be updated too and possibly rename pcie_wait_for_link_delay() to __pcie_wait_for_link_delay() or so to make sure it is something internal to that file.