Re: [PATCHv3 8/9] ARM: OMAP2+: AM33XX: Basic suspend resume support

On 08/08/2013 06:04 PM, Kevin Hilman wrote:
Nishanth Menon <nm@xxxxxx> writes:

On 08/08/2013 04:14 PM, Kevin Hilman wrote:
Dave Gerlach <d-gerlach@xxxxxx> writes:

On 08/08/2013 10:03 AM, Santosh Shilimkar wrote:
$subject and patch don't match.

On Thursday 08 August 2013 08:26 AM, Nishanth Menon wrote:
On 08/08/2013 03:45 AM, Russ Dill wrote:
In reference to
the M3 handling it, the M3 wouldn't know which devices have a driver
bound and which don't.
Does it need to? M3 firmware can pretty much define "I will force
the device into low power state, and if the drivers don't handle
things properly, fix the darned driver". M3 behavior should be
considered as a "hardware" as far as Linux running on MPU is
concerned, and firmware helps change the behavior by accounting for
SoC quirks. *if* we have ability to handle this in the firmware,
there is no need to carry this in Linux.

I agree with Nishanth. I don't like this patch and IIRC, I gave the same
comment on the last version. Linux need not know about all such firmware
quirks. Also, all this M3-specific stuff should be done somewhere
else. Probably having a small M3 driver won't be a bad idea.

I am not opposed to doing it this way and letting the M3 firmware
handle idling these modules, however the one concern raised in the
last series is that an approach that does not acknowledge drivers will
hide driver PM bugs. I suppose as long as I make sure to document that
the devices are being idled by the M3 firmware this may not be an
issue. I will look into implementing this.

No, please don't start idling devices in firmware that are otherwise
managed by Linux.  Keep the firmware simple and dumb.  Linux is managing
these devices, it should manage their bugs too.


This is not just about idling devices.  This is about handling broken IP
blocks whose power-on reset state does not allow the powerdomain to
reach its target state.  That's just bad hardware design.

Right, this is where M3 can help -> provide a consistent state for
the Linux kernel to work with. Given that we want to keep the majority
of the power code inside the master CPU, we are just letting M3 help us
with nothing major at all..

heh, I would say HW design bugs like this are more than "nothing major
at all." :)

Tiny stuff like this can help "fix" the hardware design quirks by
hiding them behind the firmware and modifying the hardware behavior.

I disagree here.  I'm a firmware minimalist, and hiding bugs like this
in the firmware is wrong when Linux is otherwise managing these devices.
It also imposes criteria on the firmware of future SoCs that doesn't
belong there either.  IMO, the only stuff the firmware should do is what
Linux *cannot* do.

Remember, this only needs to happen when there isn't a driver for these
devices.  Should we communicate to the firmware that the OS has no
driver, so please enable the hack?  I think not.

My view is that the M3 should *ignore* the presence/existence of MPU's drivers. M3 will do whatever it takes to force the system into suspend once notified. This saves us the perpetual trouble in production systems where drivers have bugs (which get exposed in weird usage scenarios), we don't get any hardware help to fix them up while attempting low power states, and the system never really hits a low power state. This was always because OMAP and its derivatives have been "democratic" in power management: if every hardware block achieves its proper state, then we achieve a system-wide low power state.


I know it breaks the purity of role, but as the
next evolution, we might want to consider M3 something like an
"accelerator" for power management activity.. (not saying it is that
fast.. but conceptually).

Yes, it breaks the purity of role, and makes it hard to maintain and
extend to future SoCs.  As a maintainer, that's a red flag.  IMO, the
roles need to be kept clear.  The M3 manages some devices and the
interconnect that MPU/Linux cannot, the rest are managed by Linux.

Suspend is a very controlled state, as against cpuidle where driver knowledge is necessary and in fact mandatory. Drivers are supposed to release their resources, and even though we test the hell out of them, we do have paths untrodden when it comes to production systems.

I think the insight we have about the hardware makes us (Linux folks) want to own the decision making process on the master MPU - I mean, *nobody* (including me) wants to trust a "firmware" - that word is almost synonymous with "unspeakable horror".

If on the other hand, we had a non-programmable hardware which would force all systems to achieve off mode (imagine having a PRCM which was really capable of doing it), we would have probably not had to deal with those pesky "stuck-in-transition" and other variants of issues (where MPU went to low power state, but core refused to go down - resulting in 200mA+ power instead of the <1mA we expected to see).

I consider M3 to power management similar to what Neon is to ARM. I mean, I would even love a PMIC which is completely reprogrammable (where I could define the registers in s/w)!

My personal thought is that (if possible):
a) we should try to make the firmware source visible to everyone who has a stake in it.
b) If (a) is possible, then we should see how we can consider M3 as an extension to Linux power strategy, rather than a "necessary burden" to carry around.

In this particular case, (a) is done; see [1]. So, why not (b)? A synergy does not necessarily mean "purity of role" is broken. It is just another way of doing the job.

While I personally don't think [1] is public enough, we can try to work through those current constraints to ensure everything is synergistic.

In other words, this is not a "Graphics" or "Multimedia" or even a few "BIOS" kind of "hidden firmware you cannot do anything about" scenario - here, *we* have the choice.

[1] http://arago-project.org/git/projects/?p=am33x-cm3.git;a=summary

That being said, IMO, the kernel (specifically omap_device) should
handle this, and it should be rather easy to do in the omap_device layer
and keep the SoC suspend/resume core code simple and ignorant of these
"quirks."

AFAICT, there's no reason these quirks need to be dealt with immediately
on suspend.  A slight delay should be fine, as long as it's before the
next suspend/idle attempt, right?

Given that, what we need to do (and by we, I mean you) is to flag all
broken IP blocks, and let omap_device handle them in a suspend/resume
notifier (c.f. register_pm_notifier() and PM_POST_SUSPEND.)
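
For illustration only, the flag-and-notify idea above can be modelled in
plain userspace C. This is a rough sketch, not kernel code and not the
omap_device implementation: struct ip_block, the broken_reset flag, the
EV_* events and both functions are hypothetical names invented here. In
a real kernel the callback would hang off a struct notifier_block passed
to register_pm_notifier() and would react to PM_POST_SUSPEND.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events, loosely modelled on the kernel's PM notifier events. */
enum pm_event { EV_SUSPEND_PREPARE, EV_POST_SUSPEND };

/* Hypothetical per-block record; none of these fields exist in omap_device. */
struct ip_block {
	const char *name;
	bool has_driver;	/* a Linux driver is bound and manages the block */
	bool broken_reset;	/* flagged: reset state blocks the powerdomain */
	bool idle;		/* block is currently idled */
};

/*
 * Notifier-style callback: after a suspend cycle, idle every flagged block
 * that has no driver to do it. Returns the number of blocks it idled.
 */
int quirk_pm_notify(enum pm_event ev, struct ip_block *blks, size_t n)
{
	int fixed = 0;
	size_t i;

	assert(blks != NULL || n == 0);

	if (ev != EV_POST_SUSPEND)
		return 0;

	for (i = 0; i < n; i++) {
		struct ip_block *b = &blks[i];

		if (b->broken_reset && !b->has_driver && !b->idle) {
			b->idle = true;
			fixed++;
		}
	}
	return fixed;
}

/* "Democratic" model: the domain reaches its target only if all blocks idle. */
bool domain_can_reach_target(const struct ip_block *blks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!blks[i].idle)
			return false;
	return true;
}
```

In this model the quirk handling stays out of the suspend/resume core:
the core only asks whether the domain can reach its target, and blocks
with a bound driver are left to that driver.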

yes - that is the alternate that comes to mind.

In the earlier reviews of this series (many months ago now), I
complained about the presence of this device specific handling in the
core MPU PM code.  I'm somewhat troubled by the fact that nobody explored
alternatives that so easily come to mind.

Just spoke to Dave in person a few mins back, and he is going to go through all the previous mail chains and attempt to be thorough again - seems like going through a written list of pending actions completely missed many key aspects of prior reviews :). Apologies on this.


--
Regards,
Nishanth Menon