Re: OMAP HDQ: was Re: DSS2/PM on 3.2 broken?

Paul Walmsley <paul@xxxxxxxxx> · Fri, 27 Jan 2012 15:58:40 -0700 (MST)

On Sat, 28 Jan 2012, NeilBrown wrote:

> On Thu, 26 Jan 2012 07:19:19 -0700 (MST) Paul Walmsley <paul@xxxxxxxxx> wrote:
>
> > Thanks.  Should I add a Tested-by: from you on those patches?
> 
> Well... I only tested them after backporting to 3.2, and Linus is a big
> proponent of the idea that once you rebase patches you invalidate any
> testing....
> However I don't think the difference between 3.2 and 3.3-rc1 is big enough
> in the area, so yes:
> 
>    Tested-by: NeilBrown <neilb@xxxxxxx>

Okay, thanks.

> > Here's a theory: perhaps the MPU powerdomain is hitting a low-power state 
> > while waiting for an HDQ interrupt.  When the MPU powerdomain is in a low 
> > power state, so is the MPU interrupt controller, so the only way that the 
> > MPU can wake up is if the HDQ can issue a wakeup event to the PRCM.  And I 
> > don't see any evidence that the HDQ is capable of doing this, based on the 
> > HDQ sections of the TRM.  What a huge energy waste, if true.
> 
> In the config I am testing MPU only goes to RET, never OFF. Same for CORE.

That qualifies as a low-power state for the MPU :-)

In MPU RET, at least in theory, the MPU INTC will not be operational, and 
thus unable to respond to interrupts.

> > Maybe try something like the following patch -- compile-tested only here.
> > 
> > If this works, you might want to try dropping this patch and using the pad 
> > mux to set a wakeup event on the 1-wire pad when the signal goes low.  
> > That might be a more power-efficient approach.  You may still have to use 
> > some PM QoS request there to ensure that the HDQ can wake up fast enough 
> > to see the pulse, but the constraint shouldn't need to be as ludicrously 
> > low as it is in the following patch.
> 
> Doesn't work - crashes :-(
> 
> pm_qos_power_init is defined as a late_initcall, so you cannot call
> pm_qos_update_request until after all the initcalls have run.
> But with your patch, the probe of the bq27000 triggers a read of the registers
> which tries to update_request - and it goes 'bang'.

Hmm, okay.  Could you please remove the register read from the HDQ probe 
and see if that makes a difference?
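
For illustration only, the ordering problem you describe can be modelled in
user space like this.  None of these names are kernel APIs: struct model_qos,
model_qos_update() and model_qos_late_init() are hypothetical stand-ins, and
deferring the request is just one way a caller could avoid touching the QoS
core before it is initialised, not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a QoS core that is only usable after a late init. */
struct model_qos {
	bool initialized;        /* set once the late init step has run */
	bool has_pending;        /* a request arrived before init */
	int pending_latency_us;  /* value remembered for later */
	int active_latency_us;   /* value currently in force, -1 = none */
};

/* Returns 0 if applied immediately, 1 if deferred until init. */
static inline int model_qos_update(struct model_qos *q, int latency_us)
{
	assert(q != NULL);
	if (!q->initialized) {
		/* Too early: remember the request instead of applying it. */
		q->pending_latency_us = latency_us;
		q->has_pending = true;
		return 1;
	}
	q->active_latency_us = latency_us;
	return 0;
}

/* Stands in for the late init step: apply whatever was queued. */
static inline void model_qos_late_init(struct model_qos *q)
{
	assert(q != NULL);
	q->initialized = true;
	if (q->has_pending) {
		q->active_latency_us = q->pending_latency_us;
		q->has_pending = false;
	}
}
```

A request made from an early probe is queued by model_qos_update() and only
takes effect once model_qos_late_init() has run; later requests apply
directly.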

> I really think the problem is the CORE pwrdm gating a clock because no module
> says it needs it - i.e. nothing to do with MPU at all.

Until pm_runtime_put*() is called, the usecount of hdq_fck will still be 
non-zero.  So the CORE shouldn't be able to gate it or hdq_ick at that 
time, and thus should not be able to enter idle. Hence the question about 
where the problems occur: whether they occur in the middle of the 
transaction or when the HDQ clocks are disabled.
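
To make the usecount argument concrete, here is a minimal user-space sketch
of the rule I'm assuming the clock framework and PRCM follow.  The types and
functions (struct model_clk, model_clk_enable(), model_clkdm_can_idle(), etc.)
are hypothetical and are not the kernel's clk or clockdomain code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct clk. */
struct model_clk {
	const char *name;
	int usecount;	/* number of active users of this clock */
};

static inline void model_clk_enable(struct model_clk *c)
{
	c->usecount++;
}

static inline void model_clk_disable(struct model_clk *c)
{
	assert(c->usecount > 0);	/* unbalanced disable is a bug */
	c->usecount--;
}

/* A clock may be gated only when nobody holds a reference to it. */
static inline bool model_clk_can_gate(const struct model_clk *c)
{
	return c->usecount == 0;
}

/* The domain may go idle only if every clock in it can be gated. */
static inline bool model_clkdm_can_idle(struct model_clk **clks, int n)
{
	for (int i = 0; i < n; i++)
		if (!model_clk_can_gate(clks[i]))
			return false;
	return true;
}
```

With hdq_fck enabled and not yet released, model_clkdm_can_idle() returns
false for a domain containing it; it only returns true again after the
matching disable.  If the hardware idles CORE anyway in that window, the
model is being violated somewhere.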

> We want to keep CORE active when an HDQ transaction is happening, but MPU is
> welcome to go to sleep.  I don't think you can express that with 'qos'.  I
> think it needs some omap-specific machinery.

The OMAP PRCM hardware should keep the CORE_* clkdms active when the 
hdq_fck is enabled.  So it's possible there could be a PRCM silicon bug 
that doesn't take hdq_fck into account when determining whether the CORE_* 
clkdms are inactive.

> I can 'fix' the problem simply by making sure
> 
> 		pwrdm_for_each_clkdm(core_pd, _cpuidle_deny_idle);
> 
> runs in omap3_enter_idle whenever HDQ is active.

Hmm, that does suggest that it's not wakeup related.

> One of the reasons that I think it is a clock problem rather than just 
> missing a wakeup event is that once the problem starts happening I 
> cannot recover without rebooting, i.e. even if I tell the UARTs to keep 
> the clocks on permanently and keep the CPUIDLE state at 0, the HDQ 
> doesn't start working again.  It has clearly become confused. The HDQ 
> doco makes a point of saying that you shouldn't issue any commands 
> (except 'enable clock') when the clock is disabled.  I think we end up 
> doing that and it gets confused and cannot recover.
> 
> I note that there is an ad-hoc dependency between the camera and various 
> power states as well.  Maybe we need a little bit of infrastructure so 
> that camera can say "Keep CORE and MPU on" (or whatever it needs) and 
> HDQ can say "Keep CORE on".  ???

I'm not familiar with the camera problems, but in the HDQ case, this 
should only be needed if a silicon bug exists.  Which is certainly 
possible; we've seen this problem with one other IP block in the past. 
Based on a quick glance at the errata, I don't see anything related to the 
HDQ, but that doesn't really mean anything.

In any case, we should be able to work around this via the hwmod layer and 
a special flag, if the problem really is a PRCM bug.  This will depend on 
the functional powerstate conversion.  

Thanks for the detailed test reports.

- Paul