On Thu, Dec 3, 2020 at 10:17 AM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote: > > On Thu, Dec 3, 2020 at 6:58 PM Thierry Reding <thierry.reding@xxxxxxxxx> wrote: > > > > From: Thierry Reding <treding@xxxxxxxxxx> > > > > Device drivers usually depend on the fact that the devices that they > > control are suspended in the same order that they were probed in. In > > most cases this is already guaranteed via deferred probe. > > > > However, there's one case where this can still break: if a device is > > instantiated before a dependency (for example if it appears before the > > dependency in device tree) but gets probed only after the dependency is > > probed. Instantiation order would cause the dependency to get probed > > later, in which case probe of the original device would be deferred and > > the suspend/resume queue would get reordered properly. However, if the > > dependency is provided by a built-in driver and the device depending on > > that driver is controlled by a loadable module, which may only get > > loaded after the root filesystem has become available, we can be faced > > with a situation where the probe order ends up being different from the > > suspend/resume order. > > > > One example where this happens is on Tegra186, where the ACONNECT is > > listed very early in device tree (sorted by unit-address) and depends on > > BPMP (listed very late because it has no unit-address) for power domains > > and clocks/resets. If the ACONNECT driver is built-in, there is no > > problem because it will be probed before BPMP, causing a probe deferral > > and that in turn reorders the suspend/resume queue. However, if built as > > a module, it will end up being probed after BPMP, and therefore not > > result in a probe deferral, and therefore the suspend/resume queue will > > stay in the instantiation order. This in turn causes problems because > > ACONNECT will be resumed before BPMP, which will result in a hang > > because the ACONNECT's power domain cannot be powered on as long as the > > BPMP is still suspended. > > > > Fix this by always reordering devices on successful probe. This ensures > > that the suspend/resume queue is always in probe order and hence meets > > the natural expectations of drivers vs. their dependencies. > > > > Reported-by: Jonathan Hunter <jonathanh@xxxxxxxxxx> > > Signed-off-by: Thierry Reding <treding@xxxxxxxxxx> > > Saravana had submitted a very similar patch (I don't have a pointer to > that one though) and I was against it at that time due to > overhead-related concerns. There still are some, but maybe that > doesn't matter in practice. Yeah, it's a very similar patch but I also suggested deleting the reorder done in the deferred probe code (I'm pretty sure we can drop it). Here's the thread: https://lore.kernel.org/lkml/20200625032430.152447-1-saravanak@xxxxxxxxxx/ Btw, I've been wondering about this recently. Do we even need device_pm_move_to_tail() to do the recursive thing once we do "add device to end of list when added" + "move probed devices to the end after probe" thing here? Doesn't this guarantee that none of the consumers can come before the supplier in the dpm list? -Saravana > > Also, I kind of expect this to blow up somewhere, but since I have no > examples ready from the top of my head, I think let's try and see, so: > > Acked-by: Rafael. J. Wysocki <rafael@xxxxxxxxxx> > > > --- > > drivers/base/dd.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c > > index 148e81969e04..cfc079e738bb 100644 > > --- a/drivers/base/dd.c > > +++ b/drivers/base/dd.c > > @@ -370,6 +370,13 @@ static void driver_bound(struct device *dev) > > > > device_pm_check_callbacks(dev); > > > > + /* > > + * Reorder successfully probed devices to the end of the device list. > > + * This ensures that suspend/resume order matches probe order, which > > + * is usually what drivers rely on. > > + */ > > + device_pm_move_to_tail(dev); > > + > > /* > > * Make sure the device is no longer in one of the deferred lists and > > * kick off retrying all pending devices > > -- > > 2.29.2 > >