Re: [RFC/PATCH] clk: samsung: exynos5433: Fix NOIRQ suspend/resume support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 20, 2018 at 3:42 PM Marek Szyprowski
<m.szyprowski@xxxxxxxxxxx> wrote:
>
> Hi Ulf,
>
> On 2018-09-19 23:18, Ulf Hansson wrote:
> > On 19 September 2018 at 14:13, Marek Szyprowski
> > <m.szyprowski@xxxxxxxxxxx> wrote:
> >> My recent commit af68ec14f112 'clk: samsung: Use NOIRQ stage for Exynos5433
> >> clocks suspend/resume' moved Exynos5433 CMU suspend/resume callbacks
> >> to NOIRQ stage. It turned out that in some cases this is not enough to
> >> ensure that clocks provided by the given CMU will be available for other
> >> drivers during their NOIRQ system suspend/resume phase. Device core calls
> >> pm_runtime_disable and pm_runtime_enable unconditionally on every device
> >> during LATE system suspend/resume phase. This means that during
> >> noirq_resume, CMU device cannot be runtime resumed, because its
> >> power.disable_depth is higher than zero (runtime PM for it will be enabled
> >> later in device_resume_early function).
> >>
> >> To let other devices properly use clocks and keep runtime PM enabled also
> >> during NOIRQ system sleep phase, add calls to pm_runtime_enable/disable
> >> to balance the actions done by device core. The CMU device will be
> >> suspended in its noirq_suspend callback and finally resumed in noirq_resume.
> >
> >> Signed-off-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> >> ---
> >> Hi!
> >>
> >> The unconditional calls to pm_runtime_enable and pm_runtime_enable
> >> in device_suspend_late and device_resume_early core functions were a bit
> >> surprising for me, because they make NOIRQ phase a special case from
> >> the perspective of runtime PM handling. Using LATE system sleep phase
> >> to call pm_runtime_enable/disable looks like a workaround for me, but right
> >> now I have no other idea how to ensure proper behavior of the Exynos5433
> >> CMU driver.
> >>
> >> The most strange thing is that I didn't observe this initially after
> >> switching to NOIRQ phase. The CMU device (FSYS_CMU) which is not under
> >> any power domain behaved correctly and was properly available for DWMMC
> >> device, which needs to enable clocks in its noirq_resume callback.
> >>
> >> However, in case of AUD_CMU and its clients (for example serial_3 device,
> >> which is a part of audio-subsytem), the pm_runtime_get_sync() called from
> >> clk_prepare() failed, because AUD_CMU device had still disabled runtime PM
> >> support. The only difference between FSYS_CMU and AUD_CMU, is the fact
> >> that AUD_CMU is under AUD power domain, while FSYS_CMU device has no power
> >> domain assigned (for completely other reasons FSYS power domains is not yet
> >> instantiated in exynos5433.dtsi).
> >>
> >> Ulf/Rafael: could You comment a bit on this issue? How it should be solved
> >> properly?
> > In general, this sounds like the classical problem of
> > suspending/resuming devices in a correct order. The long term solution
> > is to use device links to address this problem.
>
> Device links won't help in this case. This is not a problem of correct
> order of suspending/resuming devices. exynos5433 cmu device is already
> suspended after its client's and resumed before them. The problem here
> is using runtime PM in exynos5433 CMU driver during NOIRQ suspend/resume
> stage.

Simply put, if a driver uses PM-runtime in its noirq system-wide
callbacks, this is a bug in the driver.

> > However, as a simple fix, I would try to add a ->prepare() callback
> > and call pm_runtime_get_sync() from it and then in a new corresponding
> > ->complete() callback, call pm_runtime_put().
> >
> > This means that the device will stay runtime resumed until the
> > suspend_noirq phase. It would also means the pm_runtime_force_resume()
> > will resume the device at resume_noirq phase. Would that work?
>
> Yes, this will work, but it will unnecessarily wake all instances of
> exynos5433 cmu and their power domains during the system suspend/resume
> cycle, what I would like to avoid.
>
> Are there any reasons for disabling runtime PM unconditionally for every
> device in the system during LATE system sleep stage?

Yes, there are.

It generally is not safe to do that, because the state of the device
after/during late suspend generally may be different than expected by
its PM-runtime callbacks and that doesn't apply just to the device
itself, but also to its parent and possible suppliers and all of their
ancestors and suppliers.  The resume of a device generally triggers a
resume of all devices it depends on and that may not be expected by
their drivers and the bus types/PM domains they depend on and/or work
with.

Your particular case appears to be simple enough, but there are more
complicated ones.

Also, it generally is not a good idea to rely on PM-runtime for
system-wide PM, because PM-runtime may be effectively disabled by user
space via sysfs for any device.  IOW, pm_runtime_suspend() or
pm_runtime_put_sync() or equivalent doesn't guarantee that the device
will actually be suspended and system-wide PM is expected to work
regardless.

So, please abandon the
runtime-resume-during-noirq-phase-of-system-wide-suspend idea and find
a different way to organize your code.

Thanks,
Rafael



[Index of Archives]     [Linux SoC Development]     [Linux Rockchip Development]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Linux SCSI]     [Yosemite News]

  Powered by Linux