On Wed, 13 Mar 2019 at 15:45, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > Hi Jiada, > > CC Marek, Jon, Mike, Stephen, linux-clk, linux-renesas-soc > > On Tue, Mar 12, 2019 at 7:51 AM Jiada Wang <jiada_wang@xxxxxxxxxx> wrote: > > Lockdep warns that prepare_lock and genpd->mlock can cause a deadlock > > the deadlock scenario is like following: > > First thread is probing cs2000 > > cs2000_probe() > > clk_register() > > __clk_core_init() > > clk_prepare_lock() ----> acquires prepare_lock > > cs2000_recalc_rate() > > i2c_smbus_read_byte_data() > > rcar_i2c_master_xfer() > > dma_request_chan() > > rcar_dmac_of_xlate() > > rcar_dmac_alloc_chan_resources() > > pm_runtime_get_sync() > > __pm_runtime_resume() > > rpm_resume() > > rpm_callback() > > genpd_runtime_resume() ----> acquires genpd->mlock > > > > Second thread is attaching any device to the same PM domain > > genpd_add_device() > > genpd_lock() ----> acquires genpd->mlock > > cpg_mssr_attach_dev() > > of_clk_get_from_provider() > > __of_clk_get_from_provider() > > __clk_create_clk() > > clk_prepare_lock() ----> acquires prepare_lock > > > > Since currently no PM provider access genpd's critical section > > in .attach_dev, and .detach_dev callbacks, so there is no need to protect > > these two callbacks with genpd->mlock. > > This patch avoids a potential deadlock by moving out .attach_dev and .detach_dev > > from genpd->mlock, so that genpd->mlock won't be held when prepare_lock is acquired > > in .attach_dev and .detach_dev > > > > Signed-off-by: Jiada Wang <jiada_wang@xxxxxxxxxx> > > Thanks a lot for your perseverance! > I had tried putting each PM Domain in its own lockdep class, but that > didn't help. > > Your patch fixes the lockdep warnings on my Salvator-X(S) boards. > > Tested-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx> > > Your solution makes sense, as the .{at,de}tach_dev() callbacks are > owned by the individual PM Domain drivers, and not by the genpd core. > > Reviewed-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx> > > One thing I'm still wondering about: genpd_add_device() is always called > with gpd_list_lock held. > However, the same cannot be said about genpd_remove_device(): > - pm_genpd_remove_device() does not take gpd_list_lock, unlike > pm_genpd_add_device(), > - genpd_dev_pm_detach() does not take gpd_list_lock, while > __genpd_dev_pm_attach() does take the lock for adding, > but not for removing (in the error patch)? That's correctly observed and it's something that needs to be fixed in the long run. Although I am not sure that we should use the gpd_list_lock... There are race conditions while removing a device at the same time as the genpd is removed. As a matter of fact, even the debugfs isn't correctly dropped, when a genpd is removed. However, because almost none it ever removing a genpd, this isn't a problem in practice. Anyway, I have a TODO list for genpd (starting to be quite longe) and this one is already on it, whatever that means. :-) [...] Kind regards Uffe