On Thu, 30 Jan 2025 at 14:19, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote: > > On Thu, Jan 30, 2025 at 12:11 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > > > On Wed, 29 Jan 2025 at 17:58, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote: > > > > > > On Wed, Jan 29, 2025 at 5:42 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > > > > > > > On Wed, 29 Jan 2025 at 16:55, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote: > > > > > > > > > > On Wed, Jan 29, 2025 at 12:53 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > > > > > > > > > > > On Tue, 28 Jan 2025 at 20:24, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote: > > > > > > > > > > > > > > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > > > > > > > > > > > > > > Commit 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the > > > > > > > resume phase") overlooked the case in which the parent of a device with > > > > > > > DPM_FLAG_SMART_SUSPEND set did not use that flag and could be runtime- > > > > > > > suspended before a transition into a system-wide sleep state. In that > > > > > > > case, if the child is resumed during the subsequent transition from > > > > > > > that state into the working state, its runtime PM status will be set to > > > > > > > RPM_ACTIVE, but the runtime PM status of the parent will not be updated > > > > > > > accordingly, even though the parent will be resumed too, because of the > > > > > > > dev_pm_skip_suspend() check in device_resume_noirq(). > > > > > > > > > > > > > > Address this problem by tracking the need to set the runtime PM status > > > > > > > to RPM_ACTIVE during system-wide resume transitions for devices with > > > > > > > DPM_FLAG_SMART_SUSPEND set and all of the devices depended on by them. > > > > > > > > > > > > > > Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase") > > > > > > > Closes: https://lore.kernel.org/linux-pm/Z30p2Etwf3F2AUvD@xxxxxxxxxxxxxxxxxxxx/ > > > > > > > Reported-by: Johan Hovold <johan@xxxxxxxxxx> > > > > > > > Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx> > > > > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > > > > > > > --- > > > > > > > drivers/base/power/main.c | 29 ++++++++++++++++++++--------- > > > > > > > include/linux/pm.h | 1 + > > > > > > > 2 files changed, 21 insertions(+), 9 deletions(-) > > > > > > > > > > > > > > --- a/drivers/base/power/main.c > > > > > > > +++ b/drivers/base/power/main.c > > > > > > > @@ -656,13 +656,15 @@ > > > > > > > * so change its status accordingly. > > > > > > > * > > > > > > > * Otherwise, the device is going to be resumed, so set its PM-runtime > > > > > > > - * status to "active", but do that only if DPM_FLAG_SMART_SUSPEND is set > > > > > > > - * to avoid confusing drivers that don't use it. > > > > > > > + * status to "active" unless its power.set_active flag is clear, in > > > > > > > + * which case it is not necessary to update its PM-runtime status. > > > > > > > */ > > > > > > > - if (skip_resume) > > > > > > > + if (skip_resume) { > > > > > > > pm_runtime_set_suspended(dev); > > > > > > > - else if (dev_pm_skip_suspend(dev)) > > > > > > > + } else if (dev->power.set_active) { > > > > > > > pm_runtime_set_active(dev); > > > > > > > + dev->power.set_active = false; > > > > > > > + } > > > > > > > > > > > > > > if (dev->pm_domain) { > > > > > > > info = "noirq power domain "; > > > > > > > @@ -1189,18 +1191,24 @@ > > > > > > > return PMSG_ON; > > > > > > > } > > > > > > > > > > > > > > -static void dpm_superior_set_must_resume(struct device *dev) > > > > > > > +static void dpm_superior_set_must_resume(struct device *dev, bool set_active) > > > > > > > { > > > > > > > struct device_link *link; > > > > > > > int idx; > > > > > > > > > > > > > > - if (dev->parent) > > > > > > > + if (dev->parent) { > > > > > > > dev->parent->power.must_resume = true; > > > > > > > + if (set_active) > > > > > > > + dev->parent->power.set_active = true; > > > > > > > + } > > > > > > > > > > > > > > idx = device_links_read_lock(); > > > > > > > > > > > > > > - list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node) > > > > > > > + list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node) { > > > > > > > link->supplier->power.must_resume = true; > > > > > > > + if (set_active) > > > > > > > + link->supplier->power.set_active = true; > > > > > > > > > > > > If I understand correctly, the suppliers are already handled when the > > > > > > pm_runtime_set_active() is called for consumers, so the above should > > > > > > not be needed. > > > > > > > > > > It is needed because pm_runtime_set_active() doesn't cause the setting > > > > > to propagate to the parent's/suppliers of the suppliers AFAICS. > > > > > > > > Hmm, even if that sounds reasonable, I don't think it's a good idea as > > > > it may introduce interesting propagation problems between drivers. > > > > > > > > For example, consider that Saravana is trying to enable runtime PM for > > > > fw_devlinks. It would mean synchronization issues for the runtime PM > > > > status, all over the place. > > > > > > What synchronization issues? > > > > Changing the runtime PM status for a parent/supplier that doesn't have > > DPM_FLAG_SMART_SUSPEND, is likely to confuse their drivers. > > I'm not sure why though. > > > You also removed that part of the comment a few lines above, in > > device_resume_noirq(). I am not sure I understand why? > > Not removed, but replaced. > > The set_active flag is only set for devices with > DPM_FLAG_SMART_SUSPEND set and devices depended on by them. Also, it > is only set for devices whose must_resume is set, which for devices > with DPM_FLAG_SMART_SUSPEND means that they literally must be resumed. > Consequently, the devices depended on by them also must be resumed. > > > > > > > > That said, is even consumer/suppliers part of the problem we are > > > > trying to solve? > > > > > > They are in general. > > > > > > It's just that stuff that was runtime-suspended prior to a system-wide > > > suspend may need to be resumed and marked as RPM_ACTIVE during > > > system-wide resume because one of the devices wants/needs to be > > > resumed then. > > > > > > If this turns out to be problematic, the problem will need to be > > > addressed, but for now I'm not seeing why there would be a problem. > > > > > > > > > > > > > > That said, maybe we instead allow parent/child to work in the similar > > > > > > way as for consumer/suppliers, when pm_runtime_set_active() is called > > > > > > for the child. In other words, when pm_runtime_set_active() is called > > > > > > for a child and the parent is runtime PM enabled, let's runtime resume > > > > > > it too, as we do for suppliers. Would that work, you think? > > > > > > > > > > The parent is not runtime-PM enabled when this happens. > > > > > > > > That sounds really weird to me. > > > > > > > > Does that mean that the parent has not been system resumed either? > > > > > > Yes. > > > > > > It hasn't been resumed yet, but it is known that it will be resumed. > > > > > > > If so, isn't that really the root cause for this problem, > > > > > > No, it is not. > > > > > > > or what am I missing? > > > > > > Essentially, what I said above. > > > > > > If a device that was suspended prior to a system-wide suspend > > > wants/needs to be resumed during the subsequent system-wide resume, > > > and it was runtime-PM-enabled before the suspend transition, it needs > > > to (a) be runtime-PM-enabled during the subsequent system-wide resume > > > transition and (b) it also needs to be marked as RPM_ACTIVE because in > > > fact it is not suspended any more. The existing code before the patch > > > takes care of this for the device itself, but not for the devices it > > > depends on which also need to be resumed (which happens) and marked as > > > RPM_ACTIVE (which doesn't happen) and the patch just makes sure that > > > the latter will happen. > > > > Thanks for clarifying! > > > > > > > > Actually, what happens now is that the actual state of the parent > > > during the system-wide resume, right before re-enabling runtime PM for > > > it, does not match its runtime PM status which is still RPM_SUSPENDED. > > > That's what is fixed here and it applies to the parent as well as to > > > all of the other devices depended on by the child and the parent. > > > > Well, unfortunately I don't think it will work to call > > pm_runtime_set_active() for parents/suppliers like this. > > As stated above, if a device with DPM_FLAG_SMART_SUSPEND has > power.must_resume set, it must be resumed. Therefore, all of the > devices depended on by it must be resumed (literally, they need to be > powered up and configured to work). This is already a rule without > the patch. > > Accordingly, they all effectively will be "active" and so their > runtime PM status must reflect this.