"Rafael J. Wysocki" <rjw@xxxxxxx> writes: > On Friday, October 08, 2010, Kevin Hilman wrote: >> "Rafael J. Wysocki" <rjw@xxxxxxx> writes: >> >> > On Thursday, October 07, 2010, Alan Stern wrote: >> >> On Thu, 7 Oct 2010, Kevin Hilman wrote: >> >> >> >> > My confusion is not about the use of spinlocks, it's a question of what >> >> > is being busy-waited for, and the thread that is being waited for is >> >> > going to complete when interrupts are disabled. >> >> > >> >> > Sorry to be dense, but can you (re)summarize what you're proposing as I >> >> > think I'm getting mixed up with all the various options we've been >> >> > tossing around. >> >> > >> >> > If it can work, I'm certainly in favor of a busy-wait approach as it >> >> > really ensures that sync requests are handled quickly. >> >> >> >> Okay, here's the story in a nutshell. Allowing a subsystem's or >> >> driver's runtime-PM callbacks to run with interrupts disabled faces two >> >> obstacles: >> >> >> >> (1): We don't want two different CPUs to run callbacks for the >> >> same device at the same time. So if a callback is already >> >> running on one CPU (i.e., if the device's runtime status is >> >> either SUSPENDING or RESUMING) then another CPU can't be >> >> allowed to invoke a callback. >> >> >> >> Thus, you can't do a synchronous pm_runtime_resume_irq() >> >> if the device is in the middle of a suspend or resume >> >> operation. We're left with two choices: Fail the synchronous >> >> call and force the driver to defer matters to a workqueue >> >> (possibly masking an IRQ line in the meantime), or busy-wait >> >> until the concurrent operation finishes. >> >> >> >> If the PM core simply avoids releasing dev->power.lock before >> >> invoking the runtime_suspend or runtime_resume callback, the >> >> end result is almost the same as with busy-waiting. >> > >> > This is slightly more complicated, because suspend is a bit different from >> > resume. I think it's generally acceptable to refuse to do a "fast path" suspend >> > if the device state is not RPM_ACTIVE at the moment, but refusing to do >> > a "fast path" resume may be more problematic (more on that later). >> > >> >> (2): In general we can't resume a device if its parent is suspended. >> >> If the parent's runtime_resume routine needs to run with >> >> interrupts enabled then there's no way to resume the device >> >> while keeping interrupts disabled. >> >> >> >> Possible solutions involve, again, deferring matters to a >> >> workqueue, or else simply not allowing the situation to arise >> >> in the first place (forbid a device to have interrupt-disabled >> >> callbacks unless its parent does too or the parent doesn't use >> >> runtime PM at all). >> >> >> >> In general I'm against the solutions that require a workqueue. >> > >> > OK >> > >> >> Raphael appears to favor workqueues for (1) and be against them for (2). >> > >> > The particular case we need to handle (I think) is that some devices need to be >> > resumed "atomically" as a part of bringing a CPU package out of a C-like-state >> > (Kevin, is that correct?). >> >> Correct, but there is more than one problematic case. >> >> Another problem is in devices that are not atomic with C-states, but >> receive wakeup IRQs while they're suspended and thus need to >> pm_runtime_get() from their ISRs, without having an essentially >> unbounded interrupt latency before the ISR can be handled. >> >> Consider the following sequence for a given device >> >> - device: ->runtime_suspend() /* registers not accessible */ >> - pm_idle >> - IRQs disabled >> - idle >> - wakeup event >> - idle loop finishes >> - IRQs enabled >> - device: ->runtime_resume() >> >> Now, consider the 'wakeup event' is an IRQ for that device. That means >> as soon as IRQs are (re)enabled (and before some other activity has >> triggered a ->runtime_resume), the device's ISR is called. Since it is >> RPM_SUSPENDED, it's registers are not accessible so it needs to do a >> pm_runtime_get(). >> >> The case that raised this initially was a GPIO IRQ demux, where the >> initial ISR is just a chained handler whichfigures out which GPIO >> fired and then call genirq for that IRQ. >> >> All that being said, while I've currently described these as two >> different problems, the second one could be solved by converting it to >> the first one. IOW, make GPIO be one of those devices that are >> suspended/resumed atomically with the CPU so that it is guaranteed to be >> awake when IRQs are re-enabled. >> >> While that would work, I've been trying to move in the opposite >> direction by trying to dis-connect them from CPU idle. >> >> > In that case not only we can't defer the resume (by >> > using a workqueue), but also we have to finish the resume within strict time >> > constraints (the time to leave the given C-like-state is one of the parameters >> > used by cpuidle governors to decide which state to choose). >> > >> > On the other hand, busy waiting (by looping) in the case the state is >> > RPM_RESUMING is as though the callbacks were executed under a spinlock and we >> > happened to wait on it. I'd prefer to avoid that. Moreover, if the state is >> > RPM_SUSPENDING at the time a "fast path" resume is started, we are almost >> > guaranteed to violate the time constraints (by busy waiting for the suspend to >> > complete and carrying out the resume). >> > >> > So, here's an idea: >> > >> > First, let's introduce two flags in struct dev_pm_info, irq_safe and >> > fast_resume. The former has to be set so that the things above work. >> > >> > Second, let's add a new flag to pass to __pm_runtime_{suspend|resume}(), >> > RPM_FAST_RESUME such that if it is passed to __pm_runtime_suspend(), it >> > causes fast_resume to be set for the given device (if the status is already >> > RPM_SUSPENDED, it just sets the flag, otherwise it suspends the device >> > normally and then sets the flag). Now, if fast_resume is set, >> > __pm_runtime_resume() has to be passed RPM_FAST_RESUME, or it will >> > fail (it also will fail if passed RPM_FAST_RESUME and fast_resume isn't >> > set). In that case the status has to be RPM_SUSPENDED or RPM_RESUMING >> > (otherwise fast_resume won't be set) and if it is RPM_RESUMING, this means >> > that the other resume has been called with RPM_FAST_RESUME too (it would >> > fail otherwise), so it's fine to busy loop until the status is RPM_SUSPENDED >> > (it's the caller's problem to avoid that). RPM_FAST_RESUME causes >> > __pm_runtime_resume() to avod turning interrupts on. >> > >> > If necessary, there may be a flag for __pm_runtime_suspend() that will >> > carry out "atomic" suspend, but I'm not sure if that's necessary. Kevin? >> >> I think we'll need both "atomic" suspend & resume. >> >> One of the devices I'm currently managing in this atomic fashion (inside >> pm_idle) is the UARTs (of course, I'm not currently using runtime PM for >> this, I'm calling platform-specific hooks directly since IRQs are >> disabled.) For the UARTs I'm using both atomic suspend and resume, >> primarily for debug so I am sure UARTs are awake to see any panics >> during the last parts of the suspend/idle process and the early parts of >> the resume path. This UART case is a bit of a hack until the driver is >> converted to runtime PM, but illustrates some debug related reasons we >> might also need atomic suspend and resume. >> >> We also have some workarounds for hardware errata that require >> putting certain devices in a specific state just before (and after) >> idle. The drivers for these devices will need both suspend and resume. > > OK, thanks for the info. > > Do you need "normal" resume to work after "atomic" suspend, or is it > sufficient that "atomic" suspend will require "atomic" resume? hmm... while I'm definitely needing an "atomic" resume after a "normal" suspend, for now I can't think of a case where a "normal" resume would be needed after an "atomic" suspend. All the cases where I'm currently using an atomic suspend also have a corresponding atomic resume. As I write this, it wouldn't surprise me down the road to find some HW errata that requires the device in a specific state only before idle, but not caring about the state after idle. That would be a case where an atomic suspend would be needed, but the resume would be "normal" sometime later when the device is next needed. Kevin -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html