On Mon, 27 Jan 2025 16:02:32 +0100 Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > On Mon, 27 Jan 2025 at 13:32, Jonathan Cameron > <Jonathan.Cameron@xxxxxxxxxx> wrote: > > > > On Mon, 27 Jan 2025 11:47:44 +0100 > > Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > > > > [...] > > > > > > > > > > > Do consider OK to change the order in pm_runtime_disable_action() to get > > > > > > > > rid of these issues, e.g.: > > > > > > > > > > > > > > > > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c > > > > > > > > index 2ee45841486b..f27d311d2619 100644 > > > > > > > > --- a/drivers/base/power/runtime.c > > > > > > > > +++ b/drivers/base/power/runtime.c > > > > > > > > @@ -1547,8 +1547,8 @@ EXPORT_SYMBOL_GPL(pm_runtime_enable); > > > > > > > > > > > > > > > > static void pm_runtime_disable_action(void *data) > > > > > > > > { > > > > > > > > - pm_runtime_dont_use_autosuspend(data); > > > > > > > > pm_runtime_disable(data); > > > > > > > > + pm_runtime_dont_use_autosuspend(data); > > > > > > > > } > > > > > > > > > > > > > > > > though I see a rpm_resume() call is still possible though pm_runtime_disable(). > > > > > > > > > > > > > > I am still worried about keeping the device runtime enabled during a > > > > > > > window when we have turned off all resources for the device. Typically > > > > > > > we want to leave the device in a low power state after unbind. > > > > > > > > > > > > > > That said, I would rather just drop the devm_pm_runtime_enable() API > > > > > > > altogether and convert all users of it into > > > > > > > pm_runtime_enable|disable(), similar to what your patch does. > > > > > > > > > > > > That is making a mess of a lot of automated cleanup for a strange > > > > > > runtime pm related path. This is pain a driver should not have > > > > > > to deal with, though I'm not clear what the right solution is! > > > > > > > > > > > > Key is that drivers should not mix devm managed cleanup and not, so > > > > > > that means that anything that happens after runtime pm is enabled > > > > > > has to be torn down manually. One solution to this might be to > > > > > > always enable it late assuming that is safe to do so there is > > > > > > never anything else done after it in the probe path of a driver. > > > > > > > > > > The problem is that runtime PM isn't really comparable to other > > > > > resources that we are managing through devm* functions. > > > > > > > > > > Enabling runtime PM for a device changes the behaviour for how > > > > > power-mgmt is handled for the device. Enabling/disabling of runtime PM > > > > > really needs to be explicitly controlled by the driver for the device. > > > > > > > > I'm sorry to say I'm not yet convinced. > > > > > > Okay, let me try one more time. :-) > > > > +CC Greg as the disagreement here is really a philosophy of what > > devm cleanup is relative to remove. Perhaps Greg or Rafael can > > given some guidance on the intent there. > > > > Mind you I think I found another subsystem working around this > > and in a somewhat more elegant, general way (to my eyes anyway!) > > > > https://elixir.bootlin.com/linux/v6.12.6/source/drivers/i2c/i2c-core-base.c#L630 > > https://lore.kernel.org/all/YFf1GFPephFxC0mC@xxxxxxxxxx/ > > > > +CC Dmitry. > > > > I2C creates an extra devres group and releases it before devm_pm_domain_detach() > > As all devm calls from the driver end up in that group, they are released > > before dev_pm_domain_detach() > > > > How would that address the problem I pointed out with runtime PM > below? This problem isn't limited to attaching/detaching PM domains. It's associated with anything that happens after a driver remove is done. We just disagree on when that remove is finished. There is nothing special about the remove() callback, that is just part of remove process. No magic transition of state that allows new things to happen follows the device driver remove finishing. Sure you can get the remove handling ordering wrong whether devm is in use or not. The trick is almost always to never mix devm and not. Once you need a single bit of manual unwinding stop with the devm and do everything beyond that point by hand (in probe order, before that point in remove order) > > > > > > > > > > > > > > Devm callbacks are explicitly registered by the driver so that they > > > > are unwound in a specific order. Many other parts of driver > > > > registration rely on this ordering. This does not seem different > > > > for runtime PM than anything else. > > > > > > If you compare clocks, for example. It's the driver that is in full > > > control of the clock gating/ungating. When the ->remove() callback > > > runs, the driver typically makes sure that it leaves the clock gated. > > > Then it doesn't really matter when the clock resource gets released. > > > The point is, the driver is in full control of the resource. > > > > Not a good example. devm_clk_get_enabled() does not gate the clock until > > I was not referring to devm_clk_get_enable(), but rather just devm_clk_get(). > > To me devm_clk_get_enable() is another interface that we should avoid. > For example, what if the clock is already gated when the ->remove() > callback runs? Then we need to ungate the clock just to make the > devres path happy so it doesn't gate an already gated clock. And this, > just to save one or two lines of code. If someone is using a clock that is gated by other calls in the driver that then indeed the use of devm_clk_get_enabled() is inappropriate or they turn they do need to enable the clock and turn it off again as you mention which is a mess but often needs doing anyway as we commonly need some clocks at least to put a device into a low power state. > > Don't get me wrong, I certainly like the devm* functions in general, > but it's not a good fit for everything. For anything that a driver otherwise calls in remove() they are a direct equivalent that is just called automatically. (though apparently not quite in platform drivers!) If you have a sequence that is sufficiently complex they may not be a good fit. I just don't think that the sequence in this driver (and many others) is complex. The driver code (with the above change from i2c ported to platform code to fix the specific problem) is a lot simpler before Claudia's v2 to change the handling. 32 lines added to work around this... > > > the devm cleanup. The assumption being that nothing that affects > > it runs between the remove() and devm cleanup. So pretty much identical > > to the runtime pm case. They being that you have to obey ordering so > > that if you need to run something after the clock is disabled then > > you register that callback before you call devm_clk_get_enabled() > > > > > > > > If runtime PM would remain enabled beyond the call to the ->remove() > > > callback, it would mean that the driver's runtime PM callbacks could > > > be called too. For example, userspace via sysfs may at any point > > > decide to runtime resume the device. In other words, we may end up > > > calling the runtime PM callbacks in the driver, when they are not > > > intended to be called. In the worst case, I guess we could even end up > > > trying to control resources (like a clock) from the ->runtime > > > _resume() callback, when the references to these resources may already > > > have been released. > > > > This is all about what we consider remove. To me, with devm_ manged cleanup > > in place, both remove() and devm_ cleanup count as parts of that remove > > process. > > There is no straightforward process here, if you would keep runtime PM > enabled beyond ->remove(). Things can happen in parallel. How? Nothing magic happens when a driver remove() ends. So there is no difference at all in a driver calling runtime pm disable in that code or in devres cleanup that happens immediately after that in the bus driver remove. > > In that case, drivers would need to extend their runtime PM callbacks > to cope with more complicated conditions, as resources that those use > may have been released. Moreover, how can we make sure that the device > is put into a low power state after the ->remove() has been called? This is a fair question. Also one commonly handled by drivers using devm though perhaps not for the reason you are thinking. Key is that most drivers should not rely at all on runtime PM being built let alone enabled. Solution is normally a devm_add_action_or_reset() call that registers a call to put the device into a low power state. Sequence is normally something like: 1) enable clks etc. 2) Turn the power on. 3) device setup 4) enable runtime PM after setting the state to active. Let autosuspend do it's work. On remove (all devm_ managed). 5) disable runtime PM. 6) Turn the power off (may involve a check on whether it is already off though in many cases it's idempotent so we don't bother checking). 7) disable clocks. in this particular driver I'm assuming that the low power state is handled via the reset lines being put back into reset by the the unwind of the two devm_reset_control_get_exclusive_deasserted() calls. > > > > > One option I did wonder about was having a devm_pm_domain_attach() > > A little cheeky but I think the call in platform_probe() is late enough > > that we don't run into the checks on no devm_ calls before driver probe. > > > > That would shuffle the dev_pm_domain_detach() to the end of the > > devm_ cleanup. Mind you the i2c approach above seems better. > > Again, this isn't limited to PM domains. All comes down to when you think driver remove is finished. I think the answer to that in the above i2c scheme is after the release of the devres group, not after end of remove call. Anything that the bus driver does will then occur after the device driver devres cleanup is done. That puts all cleanup in device driver remove() immediately before the devres cleanup of anything setup in probe(), making manual unwinding and automatic unwinding effective the same. So for me the right fix here is to make the same thing happen for platform devices as for i2c ones. Jonathan > > [...] > > Kind regards > Uffe