On Sat, 2012-03-17 at 21:15 +0000, Russell King - ARM Linux wrote: > And the reason is that a platform _driver_ (omapdss_dss) is being > registered while a platform device (omapdss) is being probed. > > This is a very bad idea. There is absolutely no reason to register > drivers from within a probe function - to put it another way, this > code is absolutely insane. > > Why? Because you're destroying the whole idea that drivers only ever > get registered once. If you happen to have two omapdss devices (okay > that probably won't happen yet) then you'll register those device > structures twice which will cause all hell to break lose. > > Moreover - and this is why it's failing - when devices are probed, their > mutex is held. But not just _their_ mutex, but also their direct parent's > mutex as well. > > So, when the omapdss_dss driver is registered while the omapdss device is > being probed, and there's already an omapdss_dss platform device present, > the driver model tries to bind the omapdss_dss platform device with the > newly registered omapdss_dss platform driver. > > That binding wants to take the mutex on the omapdss device, but wait, > that's already held by the thread registering the omapdss_dss platform > driver. Hence, deadlock. > > This mess has been created by all those > "DSS2: xxx: create platform_driver, move init, exit to driver" > > commits, and they're all _wrong_ for the above reason. Yep, it's totally broken. It's been working by luck, and nobody has paid attention to it. I noticed this while I was working with device tree adaptation, and the patches here fix the issue: http://marc.info/?l=linux-omap&m=133112435224731&w=2 I wasn't planning to merge them yet, as it's quite late in the release cycle, but it seems I have to. I think I'll drop some of the unrelated patches (taal and tfp410 related) from that series, as they are just cleanup-churn. > However, I doubt simply moving the driver registration calls out of the > probe function will be enough - "OMAP: DSS2: Fix init and unit sequence" > hints that there's a dependence in the driver initialization order. > That's another finger pointing at the approach being wrong, because > there is _no_ guarantee as to the order in which drivers or devices are > probed by the driver model. True. I think the best would be to have a dynamic approach, where the subdrivers would register themselves somewhere, and it the order wouldn't matter. That would even allow compiling the subdrivers as separate modules. But that's a bigger work item, so what I did in the series above is that I changed platform_driver_register()s to platform_driver_probe()s. All the DSS subdevices are present at boot time and are non-hotpluggable, so I think that should work fine. However, there's one problem with this approach that is present in the code: we don't know if platform_driver_probe() returns an error because there are no devices for the driver (e.g. no HDMI on OMAP3), or because the probe of the driver failed. In the former case we don't need to care about the error, but in the latter case we should do something. I'll try to find a quick solution for this also. Tomi
Attachment:
signature.asc
Description: This is a digitally signed message part