Quoting Johan Hovold (2022-11-03 07:31:09) > On Wed, Nov 02, 2022 at 10:07:17AM -0700, Stephen Boyd wrote: > > We shouldn't be calling runtime PM APIs from within the genpd > > enable/disable path for a couple reasons. > > > > First, this causes an AA lockdep splat because genpd can call into genpd > > code again while holding the genpd lock. > > > > WARNING: possible recursive locking detected > > > Second, this confuses runtime PM on CoachZ for the camera devices by > > causing the camera clock controller's runtime PM usage_count to go > > negative after resuming from suspend. This is because runtime PM is > > being used on the clock controller while runtime PM is disabled for the > > device. > > > > The reason for the negative count is because a GDSC is represented as a > > genpd and each genpd that is attached to a device is resumed during the > > noirq phase of system wide suspend/resume (see the noirq suspend ops > > assignment in pm_genpd_init() for more details). The camera GDSCs are > > attached to camera devices with the 'power-domains' property in DT. > > Every device has runtime PM disabled in the late system suspend phase > > via __device_suspend_late(). Runtime PM is not usable until runtime PM > > is enabled in device_resume_early(). The noirq phases run after the > > 'late' and before the 'early' phase of suspend/resume. When the genpds > > are resumed in genpd_resume_noirq(), we call down into gdsc_enable() > > that calls pm_runtime_resume_and_get() and that returns -EACCES to > > indicate failure to resume because runtime PM is disabled for all > > devices. > > Probably worth mentioning the fact that those runtime PM calls > unconditionally failing during resume means that the GDSCs are never > even enabled. > > Seems like the PM runtime usage counters would still be balanced after > this though as they are decremented also on failure during suspend (i.e. > domain remains off and no usage counter is incremented during resume). > I'm seeing negative usage counts. > But this is clearly just very broken. > > > Upon closer inspection, calling runtime PM APIs like this in the GDSC > > driver doesn't make sense. It was intended to make sure the GDSC for the > > clock controller providing other GDSCs was enabled, specifically the > > MMCX GDSC for the display clk controller on SM8250 (sm8250-dispcc), so > > that GDSC register accesses succeeded. That will already happen because > > we make the 'dev->pm_domain' a parent domain of each GDSC we register in > > gdsc_register() via pm_genpd_add_subdomain(). When any of these GDSCs > > are accessed, we'll enable the parent domain (in this specific case > > MMCX). > > > > We also remove any getting of runtime PM during registration, because > > when a genpd is registered it increments the count on the parent if the > > genpd itself is already enabled. And finally, the runtime PM state of > > the clk controller registering the GDSC shouldn't matter to the > > subdomain setup. Therefore we always assign 'dev' unconditionally so > > when GDSCs are removed we properly unlink the GDSC from the clk > > controller's pm_domain. > > This last bit makes no sense as 'dev' was only used for the runtime PM > management and should be removed by this patch. Oh sheesh, the name 'gdsc_register()' really throws me off. I think it's one gdsc being registered, but it's actually plural gdscs, sigh. I will fix it.