Bjorn, On Fri, Nov 4, 2022 at 6:57 AM Douglas Anderson <dianders@xxxxxxxxxxxx> wrote: > > The pm_runtime usage in lpass-sc7280 was broken in quite a few > ways. Specifically: > > 1. At the end of probe it called "put" twice. This is a no-no and will > end us up with a negative usage count. Even worse than calling > "put" twice, it never called "get" once. Thus after bootup it could > be seen that the runtime usage of the devices managed by this > driver was -2. > 2. In some error cases it manually called pm_runtime_disable() even > though it had previously used devm_add_action_or_reset() to set > this up to be called automatically. This meant that in these error > cases we'd double-call pm_runtime_disable(). > 3. It forgot to call undo pm_runtime_use_autosuspend(), which can > sometimes have subtle problems (and the docs specifically mention > that you need to undo this function). > > Overall the above seriously calls into question how this driver is > working. It seems like a combination of "it doesn't", "by luck", and > "because of the weirdness of runtime_pm". Specifically I put a > printout to the serial console every time the runtime suspend/resume > was called for the two devices created by this driver (I wrapped the > pm_clk calls). When I had serial console enabled, I found that the > calls got resumed at bootup (when the clk core probed and before our > double-put) and then never touched again. That's no good. > [ 0.829997] DOUG: my_pm_clk_resume, usage=1 > [ 0.835487] DOUG: my_pm_clk_resume, usage=1 > > When I disabled serial console (speeding up boot), I got a different > pattern, which I guess (?) is better: > [ 0.089767] DOUG: my_pm_clk_resume, usage=1 > [ 0.090507] DOUG: my_pm_clk_resume, usage=1 > [ 0.151885] DOUG: my_pm_clk_suspend, usage=-2 > [ 0.151914] DOUG: my_pm_clk_suspend, usage=-2 > [ 1.825747] DOUG: my_pm_clk_resume, usage=-1 > [ 1.825774] DOUG: my_pm_clk_resume, usage=-1 > [ 1.888269] DOUG: my_pm_clk_suspend, usage=-2 > [ 1.888282] DOUG: my_pm_clk_suspend, usage=-2 > > These different patterns have to do with the fact that the core PM > Runtime code really isn't designed to be robust to negative usage > counts and sometimes may happen to stumble upon a behavior that > happens to "work". For instance, you can see that > __pm_runtime_suspend() will treat any non-zero value (including > negative numbers) as if the device is in use. > > In any case, let's fix the driver to be correct. We'll hold a > pm_runtime reference for the whole probe and then drop it (once!) at > the end. We'll get rid of manual pm_runtime_disable() calls in the > error handling. We'll also switch to devm_pm_runtime_enable(), which > magically handles undoing pm_runtime_use_autosuspend() as of commit > b4060db9251f ("PM: runtime: Have devm_pm_runtime_enable() handle > pm_runtime_dont_use_autosuspend()"). > > While we're at this, let's also use devm_pm_clk_create() instead of > rolling it ourselves. > > Note that the above changes make it obvious that > lpassaudio_create_pm_clks() was doing more than just creating > clocks. It was also setting up pm_runtime parameters. Let's rename it. > > All of these problems were found by code inspection. I started looking > at this driver because it was involved in a deadlock that I reported a > while ago [1]. Though I bisected the deadlock to commit 1b771839de05 > ("clk: qcom: gdsc: enable optional power domain support"), it was > never really clear why that patch affected it other than a luck of > timing changes. I'll also note that by fixing the timing (as done in > this change) we also seem to aboid the deadlock, which is a nice > benefit. > > Also note that some of the fixes here are much the same type of stuff > that Dmitry did in commit 72cfc73f4663 ("clk: qcom: use > devm_pm_runtime_enable and devm_pm_clk_create"), but I guess > lpassaudiocc-sc7280.c didn't exist then. > > [1] https://lore.kernel.org/r/20220922154354.2486595-1-dianders@xxxxxxxxxxxx > > Fixes: a9dd26639d05 ("clk: qcom: lpass: Add support for LPASS clock controller for SC7280") > Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx> > --- > > drivers/clk/qcom/lpassaudiocc-sc7280.c | 55 ++++++++++---------------- > 1 file changed, 21 insertions(+), 34 deletions(-) Is anything blocking this series from landing? I noticed a few other patches have landed since then to your Qualcomm clk branch, but I don't see these there. I assume it'll land through your tree. Thanks! -Doug