Re: Armada DRM: bridge with componentized devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 8, 2019 at 4:07 PM Russell King - ARM Linux
<linux@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > >> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > >>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > >>>> Issues with device links have nothing to do with hotplugging, they are
> > >>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > >>>> slightly different of lifetime of device links, and this is racy even if
> > >>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > >>>> no associated struct device) assuming we can reuse its parent to create
> > >>>> device link results in circular dependencies.
> > >>> How about having the device links created depending on whether the
> > >>> main drm driver wants them or not - that would mean that Exynos
> > >>> could continue avoiding them, but others that want them can have
> > >>> the links?
> > >>
> > >> That should be OK for Exynos. But regardless of Exynos device_links at
> > >> the current state will not work correctly with bridges/panels as I
> > >> described earlier.
> > > However, I'm not sure you're correct with your interpretation of the
> > > documentation.  Firstly, the documentation says:
> > >
> > >    Another example for an inconsistent state would be a device link that
> > >    represents a driver presence dependency, yet is added from the consumer's
> > >    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > >    known about the device link earlier, it wouldn't have probed the consumer
> > >    in the first place. The onus is thus on the consumer to check presence of
> > >    the supplier after adding the link, and defer probing on non-presence.
> > >
> > > This is fine - if we add the device link from of_drm_find_bridge(), we
> > > will be in the consumer's ->probe function, and the supplier must have
> > > been probed for us to find the struct drm_bridge.
> >
> >
> > Supplier usually is registered in it's probe time, so there is short
> > period of time when supplier is available, but the probe is not yet
> > finished - quite unsafe, but not impossible, especially if there exists
> > some kind of notifications about resource appearance (MIPI-DSI case).
>
> At some point during the supplier probe, the resource becomes available
> to consumers.  At that point, device links can be setup before the
> supplier has finished probing.  So any driver that provides resources
> to another driver will, at some point during the provider's probe,
> make resources available, and therefore be a candidate for device links
> to be created _before_ the probe function has returned.
>
> What is a problem is if the provider publishes resources, and then fails
> its probe function, causing the resource to disappear.
>
> Taking DRM bridge, drm_bridge_add() returns void - it never fails.
> Most bridge drivers do drm_bridge_add() as the very last step before
> returning zero from their probe function.  There are a few exceptions.

Yup, that's how it should be. And this is a general rule: Only ever
publish your interface (whether userspace dev node, or some kernel
internal thing) once setup is 100% done and nothing can fail anymore.
If you call drm_bridge_add before your bridge is fully set up, that's
a bug. Same way that calling drm_dev_register before all the sub-parts
(except drm_connector, which can be hotplugged) is a bug. That
ordering issue is why the ->load callback is deprecated, since it's
fundamentally unsafe (but we did paper over the worst races with
opening the chardev using some global locking).

If you want more examples from within drm look at gem bo or
drm_framebuffer: These also only get registered once fully set up, and
the registration is the very last thing addfb or dumb_create do. Heck
even for hotplugged drm_connector the same rule holds: First set it up
(including panel behind it and everything), then register it.

> megachips-stdpxxxx-ge-b850v3-fw.c is already buggy - it calls
> devm_request_threaded_irq(), which if it fails, the bridge is left added
> to the global list of bridges.  Since this structure was allocated with
> devm_kzalloc(), it will be freed when the probe function fails.  So,
> any failure here (eg a deferred probe because the IRQ controller is not
> available) already creates a latent bug just waiting to bite.
>
> tc358764.c is another case where the bridge is published prior to
> initialisation completion, but it looks like that's under the control
> of the "host" (consumer) driver.
>
> mtk_hdmi.c enables clocks for audio after registering the bridge, which
> may fail, but are unlikely to.  However, moving them prior to
> drm_bridge_add() probably won't hurt and avoids the "published
> interfaces which then vanish" problem.
>
> Then we have exynos_drm_mic.c and tda998x_drv.c, but I think the
> latter's error path after drm_bridge_add() can be eliminated if we
> transitioned to device links instead of the component helper.
>
> Outside DRM, take regulators - at some point during a regulator
> supplier's probe function, the resource will be published, and as soon
> as it is, it's available for regulator_get() to find it and setup a
> device link before the probe function has finished.
>
> From what I can tell, in the situation where a supplier makes resources
> available, the consumer binds to the resource, and then the supplier
> goes away, the device link will remain, and the consumer will not be
> unbound, which leads to an unexpected state.  The solution to this is
> the age old kernel rule of "don't publish your interfaces until you've
> completed initialisation".  IOW, publish at a point where you aren't
> going to fail.

Fully agreed. Any driver that does something else is suspect, and
definitely shouldn't be held up as an example of what we should
support in generic/shared code. There's just too many surprises
lurking otherwise (and so if you insist on doing things like that, you
get to hand-roll a lot more of your driver code).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux