Hi, On Sun, Mar 24, 2024 at 10:44 PM Stephen Boyd <sboyd@xxxxxxxxxx> wrote: > > Introduce a list of clk_core structures that have been registered, or > are in the process of being registered, that require runtime PM to > operate. Iterate this list and call clk_pm_runtime_get() on each of them > without holding the prepare_lock during clk_disable_unused(). This way > we can be certain that the runtime PM state of the devices will be > active and resumed so we can't schedule away while walking the clk tree > with the prepare_lock held. Similarly, call clk_pm_runtime_put() without > the prepare_lock held to properly drop the runtime PM reference. There's a part of me that worries about the fact that we'll now be doing a pm_runtime get() on _all clocks_ (even those that are used) at bootup now. I worry that some device out there will be unhappy about it. ...but I guess the device passed in here is already documented to be one that the clock framework can get/put whenever it needs to prepare the clock, so that makes me feel like it should be fine. Anyway, no action item, just documenting my thoughts... Oh, funny. After reading the next patch, I guess I'm even less concerned. I guess we were already grabbing the pm_runtime state for all clocks while printing the clock summary. While that's a debugfs function, it's still something that many people have likely exercised and it's likely not going to introduce random/long tail problems. > +/* > + * Call clk_pm_runtime_get() on all runtime PM enabled clks in the clk tree so > + * that disabling unused clks avoids a deadlock where a device is runtime PM > + * resuming/suspending and the runtime PM callback is trying to grab the > + * prepare_lock for something like clk_prepare_enable() while > + * clk_disable_unused_subtree() holds the prepare_lock and is trying to runtime > + * PM resume/suspend the device as well. > + */ > +static int clk_pm_runtime_get_all(void) nit: It'd be nice if this documented that it acquired / held the lock. Could be in comments, or, might as well use the syntax like this (I think): __acquires(&clk_rpm_list_lock); ...similar with the put function. > + /* > + * Runtime PM "get" all the devices that are needed for the clks > + * currently registered. Do this without holding the prepare_lock, to > + * avoid the deadlock. > + */ > + hlist_for_each_entry(core, &clk_rpm_list, rpm_node) { > + ret = clk_pm_runtime_get(core); > + if (ret) { > + failed = core; > + pr_err("clk: Failed to runtime PM get '%s' for clk '%s'\n", > + failed->name, dev_name(failed->dev)); If I'm reading this correctly, the strings are backward in your error print. Right now you're printing: clk: Failed to runtime PM get '<clk_name>' for clk '<dev_name>' With the printout fixed and some type of documentation that clk_pm_runtime_get_all() and clk_pm_runtime_put_all() grab/release the mutex: Reviewed-by: Douglas Anderson <dianders@xxxxxxxxxxxx>