RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Kevin Hilman [mailto:khilman@xxxxxxxxxxxxxxxxxxx]
> Sent: Thursday, August 05, 2010 3:17 AM
> To: Basak, Partha
> Cc: Paul Walmsley; linux-omap@xxxxxxxxxxxxxxx; Kalliguddi, Hema; Raja,
> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
> Subject: Re: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 
> "Basak, Partha" <p-basak2@xxxxxx> writes:
> 
> >> -----Original Message-----
> >> From: Kevin Hilman [mailto:khilman@xxxxxxxxxxxxxxxxxxx]
> >> Sent: Tuesday, August 03, 2010 11:06 PM
> >> To: Basak, Partha
> >> Cc: Paul Walmsley; linux-omap@xxxxxxxxxxxxxxx; Kalliguddi, Hema; Raja,
> >> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
> >> Subject: Re: Issues with calling pm_runtime functions in
> >> platform_pm_suspend_noirq/IRQ disabled context.
> >>
> >> "Basak, Partha" <p-basak2@xxxxxx> writes:
> >>
> >> > Resending with the corrected mailing list
> >> >
> >> > Hello Kevin,
> >> >
> >> > I want to draw your attention to an issue the following patch
> >> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap-
> >> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc
> >>
> >> [...]
> >>
> >> > I believe, it is not correct to call the pm_runtime APIs from
> >> > interrupt locked context since the underlying functions
> >> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call
> >> > schedule().
> >> >
> >> > Further, from a s/w layering standpoint, platform-bus is a deeper
> >> > layer than the Runtime layer, which the runtime layer calls into. So
> >> > it may not be correct to have this layer calling into pm_runtime().
> It
> >> > should directly call omap_device APIs.
> >>
> >> The original version of this patch did directly call the omap_device
> >> APIs.  However, Paul didn't like that approach and there were
> >> use-counting problems to handle as well (e.g. if a driver was not (yet)
> >> in use, it would already be disabled, and a suspend would trigger an
> >> omap_device_disable() which would fail since device was already
> >> disabled.)
> >>
> >> Paul and I agreed that this current approach would solve the
> >> use-counting problems, but you're correct in that it introduces
> >> new problems as these functions can _possibly_ sleep/schedule.
> >>
> >> That being said, have you actually seen problems here?   I would like
> >> to see how they are triggered?
> >
> > Scenario 1:
> >
> > Consider the case where a driver has already called
> > pm_runtime_put_sync as part of aggressive clock cutting
> > scheme. Subsequently, when system suspend is called,
> > pm_runtime_put_sync is called again.
> 
> > Due to atomic_dec_and_test check
> > on usage_count, the second call goes through w/o error.
> 
> I don't think you're fully understanding the point of the put/get in the
> suspend/resume path.
> 
> The reason for the 'put' in the suspend path is because the PM core has
> done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that
> runtime PM transitions are prevented during suspend/resume.
> 
> On OMAP, we want to allow those transitions to happen so we can use the
> same runtime PM calls in the idle and suspend paths.
> 
> > At System resume, pm_runtime_get_sync is called twice, once by resume,
> > once by the driver itself.
> 
> Yes, but there is a 'put_sync' in between those done by the PM core
> during resume (c.f. dpm_complete() in drivers/base/power/main.c]
> 
> > Because of the extra reference count, the
> > aggressive clock control by the driver is broken, as call to put_sync
> > by a driver reduces to usage_count to 1. As a result clock transition
> > in Idle-path is broken.
> 
I understand the rationale better, but having these changes is making the next Idle call after a suspend-resume cycle to fail. I will continue debugging on this and come back with more details.

> 
> > Scenario2:
> >
> > Tested GPIO for Suspend with these changes & obtained dump of Circular
> Locking dependencies:
> 
> I don't think these to problems are related, AFAICT, they are separate
> issues.  I'll respond to this scenario in a different reply.
> 
> >>
> >> [...]
> >>
> >> The UART driver is a special case as it is managed from deep inside the
> >> idle path.
> >>
> >> > Can you feedback on my observations and comment on the approach taken
> >> > for these drivers?
> >>
> >> I cannot comment until I understand why these drivers need to
> >> enable/disable with interrupts off.
> >>
> >> In general, any use of the interrupts-off versions of the omap_device
> >> APIs need to be thorougly justified.
> >>
> >> Even the UART one will go away when the omap-serial driver is converted
> >> to runtime PM.
> >>
> >
> > For GPIO, it is a must to relinquish clocks in Idle-path as it is
> > possible to have a GPIO bank requested and still allow PER domain to
> > go to OFF.
> 
> These the kind of things that I would expect to be discussed/described
> in the changelogs to patches posted to the list.
> 
> > For USB, this is required because of a h/w issue.
> 
> Again, patches with descriptive changelogs describing the "h/w issue"
> please.
> 
> > My point is, just by exposing a _omap_hwmod_idle()(mutex-free version)
> will not let us call pm_runtime APIs in Idle path as we are not supposed
> to enable interrupts there. We have faced problems with USB big-time in
> Idle path while using pm_runtime functions. So, we are resorting to
> calling omap_device_idle() in CPU_Idle path, and runtime functions in
> thread-context.
> >
> > Is this acceptable, given the limitations?
> 
> No, this is not (yet) acceptable.
> 
> The limitations you've come up against are not fully understood yet.
> Rather trying to hack around the limitations, we need to fully
> understand the root causes, which has not yet been done.
> 
> It's possible (and likely) that there are different ways to handle these
> problems, and entirely possible that there are locking issues we have to
> resolve in the omap_device/hwmod layers as well.
> 

GPIO:
The rationale for doing omap_device_idle/enable + context save/restore in the Idle path has been provided in the GPIO patch. We can continue the discussion on this topic on the individual submission lists of the drivers.

UART:
We will try to remove it during UART cleanup activity that will happen .

USB: Calling pm_runtime_get/put_sync in Idle path for USB is leading to IO-pad interrupts, therefore we are resorting to using omap_deviceIdle/Enable APIs. The MUSB patch will be posted shortly.
> Kevin

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux