On 29/04/2021 07:46, Tony Lindgren wrote:
Hi,
* Laurent Pinchart <laurent.pinchart@xxxxxxxxxxxxxxxx> [210428 14:10]:
Based on my experience on the camera and display side with devices that
are made of multiple components, suspend and resume are best handled in
a controlled way by the top-level driver. Otherwise you end up having
different components suspending and resuming in random orders, and
that's a recipe for failure.
Manually suspending and resuming the components should be doable based
on the registered components. However, I'm not sure it buys much in
this case since we do have Linux driver module take care of things for
us for most part :)
The dss hardware is really a private interconnect with a control module,
and a collection of various mostly independent display output device
modules.
We also have the interconnect target module to deal with for each
module, and have the interconnect hierachy mapped in the devicetree.
So we already have Linux driver module take care of the device
hierarchy.
Because the child components are mostly independent, it should not
matter in which order they suspend and resume related to each other.
I think the remaining issue is how dispc should provide services to
the other components.
If dispc needs to be enabled to provide services to the other modules,
maybe there's some better Linux generic framework dispc could implement?
That is other than PM runtime calls for routing the signals to the
output modules? Then PM runtime can be handled private to the dispc
module.
What would be the difference? The dispc service would just call runtime
get and put, like it does now, wouldn't it?
Decoupling the system suspend and resume from PM runtime calls for
all the other dss components should still also be done IMO. But that
can be done as a separate clean-up patches after we have fixed the
$subject issue.
I don't think I still really understand why all this is needed. I mean,
obviously things don't work correctly at the moment, so maybe this patch
can be applied to fix the system suspend. But it just feels like a big
hack (the current pm_runtime_force_suspend/resume work-around feels like
a big hack too).
Why doesn't the suspend just work? Afaics, the runtime PM code in
omapdrm is fine: the dependencies work nicely, things get runtime
suspended and resumes correctly. And at system suspend, omapdrm will
disable the whole display pipeline (including bridges, panels) in a
controlled manner, which results in the appropriate runtime PM calls. I
think this should just work. But it doesn't, because... runtime PM and
system suspend don't quite work well together? Or something.
So is it something that omapdrm is doing in a wrong way, or is the PM
framework just messed up, and other drivers need to dance around the
problems too?
Can we get the omapdrm suspend/resume to run first/last, and
stop/restart the whole device from there ?
This is already the case since commit ecfdedd7da5d ("drm/omap: force
runtime PM suspend on system suspend"). We have omap_drv use
SIMPLE_DEV_PM_OPS, and the components use SET_LATE_SYSTEM_SLEEP_PM_OPS.
So omap_drv suspends first and resumes last. The order should not
matter for other components. Well that is as long as we can deal
with dispc providing resources.
I think we really should also change omap_drv use prepare/complete ops,
and have the components use standard SIMPLE_DEV_PM_OPS. That still
won't help with PM runtime related issues for system suspend and
resume though, but leaves out the need for late pm ops.
Why do we need to do the above? What would omapdrm do in prepare &
complete? Why would we use SIMPLE_DEV_PM_OPS for the dss subcomponents?
Slightly off topic, but I just noticed that we're using runtime_put_sync
for some reason. Found 0eaf9f52e94f756147dbfe1faf1f77a02378dbf9. I've
been fighting with system suspend for a long time =).
I wonder if using non-sync version would remove the EBUSY problem...
Tomi