01.10.2021 17:55, Ulf Hansson пишет: > On Fri, 1 Oct 2021 at 16:29, Dmitry Osipenko <digetx@xxxxxxxxx> wrote: >> >> 01.10.2021 16:39, Ulf Hansson пишет: >>> On Mon, 27 Sept 2021 at 00:42, Dmitry Osipenko <digetx@xxxxxxxxx> wrote: >>>> >>>> Add runtime power management and support generic power domains. >>>> >>>> Tested-by: Peter Geis <pgwipeout@xxxxxxxxx> # Ouya T30 >>>> Tested-by: Paul Fertser <fercerpav@xxxxxxxxx> # PAZ00 T20 >>>> Tested-by: Nicolas Chauvet <kwizart@xxxxxxxxx> # PAZ00 T20 and TK1 T124 >>>> Tested-by: Matt Merhar <mattmerhar@xxxxxxxxxxxxxx> # Ouya T30 >>>> Signed-off-by: Dmitry Osipenko <digetx@xxxxxxxxx> >>>> --- >>>> drivers/gpu/drm/tegra/gr2d.c | 155 +++++++++++++++++++++++++++++++++-- >>> >>> [...] >>> >>>> static int gr2d_remove(struct platform_device *pdev) >>>> @@ -259,15 +312,101 @@ static int gr2d_remove(struct platform_device *pdev) >>>> return err; >>>> } >>>> >>>> + pm_runtime_dont_use_autosuspend(&pdev->dev); >>>> + pm_runtime_disable(&pdev->dev); >>> >>> There is no guarantee that the ->runtime_suspend() has been invoked >>> here, which means that clock may be left prepared/enabled beyond this >>> point. >>> >>> I suggest you call pm_runtime_force_suspend(), instead of >>> pm_runtime_disable(), to make sure that gets done. >> >> The pm_runtime_disable() performs the final synchronization, please see [1]. >> >> [1] >> https://elixir.bootlin.com/linux/v5.15-rc3/source/drivers/base/power/runtime.c#L1412 > > pm_runtime_disable() end up calling _pm_runtime_barrier(), which calls > cancel_work_sync() if dev->power.request_pending has been set. > > If the work that was punted to the pm_wq in rpm_idle() has not been > started yet, we end up just canceling it. In other words, there are no > guarantees it runs to completion. You're right. Although, in a case of this particular patch, the syncing is actually implicitly done by pm_runtime_dont_use_autosuspend(). But for drivers which don't use auto-suspend, there is no sync. This looks like a disaster, it's a very common pattern for drivers to 'put+disable'. > Moreover, use space may have bumped the usage count via sysfs for the > device (pm_runtime_forbid()) to keep the device runtime resumed. Right, this is also a disaster in a case of driver removal. >> Calling pm_runtime_force_suspend() isn't correct because each 'enable' >> must have the corresponding 'disable'. Hence there is no problem here. > > pm_runtime_force_suspend() calls pm_runtime_disable(), so I think that > should be fine. No? [adding Rafael] Rafael, could you please explain how drivers are supposed to properly suspend and disable RPM to cut off power and reset state that was altered by the driver's resume callback? What we're missing? Is Ulf's suggestion acceptable? The RPM state of a device is getting reset on driver's removal, hence all refcounts that were bumped by the rpm-resume callback of the device driver will be screwed up if device is kept resumed after removal. I just verified that it's true in practice.