On Thu, Dec 3, 2020 at 6:58 PM Thierry Reding <thierry.reding@xxxxxxxxx> wrote: > > From: Thierry Reding <treding@xxxxxxxxxx> > > Device drivers usually depend on the fact that the devices that they > control are suspended in the same order that they were probed in. In > most cases this is already guaranteed via deferred probe. > > However, there's one case where this can still break: if a device is > instantiated before a dependency (for example if it appears before the > dependency in device tree) but gets probed only after the dependency is > probed. Instantiation order would cause the dependency to get probed > later, in which case probe of the original device would be deferred and > the suspend/resume queue would get reordered properly. However, if the > dependency is provided by a built-in driver and the device depending on > that driver is controlled by a loadable module, which may only get > loaded after the root filesystem has become available, we can be faced > with a situation where the probe order ends up being different from the > suspend/resume order. > > One example where this happens is on Tegra186, where the ACONNECT is > listed very early in device tree (sorted by unit-address) and depends on > BPMP (listed very late because it has no unit-address) for power domains > and clocks/resets. If the ACONNECT driver is built-in, there is no > problem because it will be probed before BPMP, causing a probe deferral > and that in turn reorders the suspend/resume queue. However, if built as > a module, it will end up being probed after BPMP, and therefore not > result in a probe deferral, and therefore the suspend/resume queue will > stay in the instantiation order. This in turn causes problems because > ACONNECT will be resumed before BPMP, which will result in a hang > because the ACONNECT's power domain cannot be powered on as long as the > BPMP is still suspended. > > Fix this by always reordering devices on successful probe. This ensures > that the suspend/resume queue is always in probe order and hence meets > the natural expectations of drivers vs. their dependencies. > > Reported-by: Jonathan Hunter <jonathanh@xxxxxxxxxx> > Signed-off-by: Thierry Reding <treding@xxxxxxxxxx> Saravana had submitted a very similar patch (I don't have a pointer to that one though) and I was against it at that time due to overhead-related concerns. There still are some, but maybe that doesn't matter in practice. Also, I kind of expect this to blow up somewhere, but since I have no examples ready from the top of my head, I think let's try and see, so: Acked-by: Rafael. J. Wysocki <rafael@xxxxxxxxxx> > --- > drivers/base/dd.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c > index 148e81969e04..cfc079e738bb 100644 > --- a/drivers/base/dd.c > +++ b/drivers/base/dd.c > @@ -370,6 +370,13 @@ static void driver_bound(struct device *dev) > > device_pm_check_callbacks(dev); > > + /* > + * Reorder successfully probed devices to the end of the device list. > + * This ensures that suspend/resume order matches probe order, which > + * is usually what drivers rely on. > + */ > + device_pm_move_to_tail(dev); > + > /* > * Make sure the device is no longer in one of the deferred lists and > * kick off retrying all pending devices > -- > 2.29.2 >