Quoting Doug Anderson (2024-03-25 09:19:37) > Hi, > > On Sun, Mar 24, 2024 at 10:44 PM Stephen Boyd <sboyd@xxxxxxxxxx> wrote: > > > > Introduce a list of clk_core structures that have been registered, or > > are in the process of being registered, that require runtime PM to > > operate. Iterate this list and call clk_pm_runtime_get() on each of them > > without holding the prepare_lock during clk_disable_unused(). This way > > we can be certain that the runtime PM state of the devices will be > > active and resumed so we can't schedule away while walking the clk tree > > with the prepare_lock held. Similarly, call clk_pm_runtime_put() without > > the prepare_lock held to properly drop the runtime PM reference. > > There's a part of me that worries about the fact that we'll now be > doing a pm_runtime get() on _all clocks_ (even those that are used) at > bootup now. I worry that some device out there will be unhappy about > it. ...but I guess the device passed in here is already documented to > be one that the clock framework can get/put whenever it needs to > prepare the clock, so that makes me feel like it should be fine. > > Anyway, no action item, just documenting my thoughts... > > Oh, funny. After reading the next patch, I guess I'm even less > concerned. I guess we were already grabbing the pm_runtime state for > all clocks while printing the clock summary. While that's a debugfs > function, it's still something that many people have likely exercised > and it's likely not going to introduce random/long tail problems. > > > > +/* > > + * Call clk_pm_runtime_get() on all runtime PM enabled clks in the clk tree so > > + * that disabling unused clks avoids a deadlock where a device is runtime PM > > + * resuming/suspending and the runtime PM callback is trying to grab the > > + * prepare_lock for something like clk_prepare_enable() while > > + * clk_disable_unused_subtree() holds the prepare_lock and is trying to runtime > > + * PM resume/suspend the device as well. > > + */ > > +static int clk_pm_runtime_get_all(void) > > nit: It'd be nice if this documented that it acquired / held the lock. > Could be in comments, or, might as well use the syntax like this (I > think): > > __acquires(&clk_rpm_list_lock); > > ...similar with the put function. I had that but removed it because on the error path we drop the lock and sparse complains. I don't know how to signal that the lock is held unless an error happens, but I'm a little out of date on sparse now. > > > > + /* > > + * Runtime PM "get" all the devices that are needed for the clks > > + * currently registered. Do this without holding the prepare_lock, to > > + * avoid the deadlock. > > + */ > > + hlist_for_each_entry(core, &clk_rpm_list, rpm_node) { > > + ret = clk_pm_runtime_get(core); > > + if (ret) { > > + failed = core; > > + pr_err("clk: Failed to runtime PM get '%s' for clk '%s'\n", > > + failed->name, dev_name(failed->dev)); > > If I'm reading this correctly, the strings are backward in your error > print. Right now you're printing: > > clk: Failed to runtime PM get '<clk_name>' for clk '<dev_name>' Good catch. Thanks! > > With the printout fixed and some type of documentation that > clk_pm_runtime_get_all() and clk_pm_runtime_put_all() grab/release the > mutex: > > Reviewed-by: Douglas Anderson <dianders@xxxxxxxxxxxx>