On Fri, May 20, 2022 at 2:41 PM Lukas Wunner <lukas@xxxxxxxxx> wrote: > > On Wed, May 18, 2022 at 07:54:32PM +0800, Sheng Bi wrote: > > +static int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout) > > +{ > > + struct pci_dev *dev; > > + int delay = 0; > > + > > + if (!bridge->subordinate || list_empty(&bridge->subordinate->devices)) > > + return 0; > > + > > + list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) { > > + while (!pci_device_is_present(dev)) { > > + if (delay > timeout) { > > + pci_warn(dev, "not ready %dms after secondary bus reset; giving up\n", > > + delay); > > + return -ENOTTY; > > + } > > + > > + msleep(20); > > + delay += 20; > > + } > > + > > + if (delay > 1000) > > + pci_info(dev, "ready %dms after secondary bus reset\n", > > + delay); > > + } > > + > > + return 0; > > +} > > An alternative approach you may want to consider is to call > pci_dev_wait() in the list_for_each_entry loop, but instead of > passing it a constant timeout you'd pass the remaining time. > > Get the current time before and after each pci_dev_wait() call > from "jiffies", calculate the difference, convert to msecs with > jiffies_to_msecs() and subtract from the "timeout" parameter > passed in by the caller, then simply pass "timeout" to each > pci_dev_wait() call. Thanks for your proposal, which can avoid doing duplicated things as pci_dev_wait(). If so, I also want to align the polling things mentioned in the question from Alex, since pci_dev_wait() is also used for reset functions other than SBR. To Bjorn, Alex, Lucas, how do you think if we need to change the polling in pci_dev_wait() to 20ms intervals, or keep binary exponential back-off with probable unexpected extra timeout delay. > > As a side note, traversing the bus list normally requires > holding the pci_bus_sem for reading. But it's probably unlikely > that devices are added/removed concurrently to a bus reset > and we're doing it wrong pretty much everywhere in the > PCI reset code, so... Yeah... I think that is why I saw different coding there. I would prefer a separate thread for estimating which ones are real risks. > > (I fixed up one of the reset functions with 10791141a6cf, > but plenty of others remain...) > > Thanks, > > Lukas