Re: Armada DRM: bridge with componentized devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 16, 2019 at 11:44 PM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
>
> On Wednesday, January 16, 2019 7:42:45 PM CET Daniel Vetter wrote:
> > On Tue, Jan 15, 2019 at 11:47 PM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> > >
> > > [CC Greg]
> > >
> > > On Tuesday, January 15, 2019 1:04:02 AM CET Rafael J. Wysocki wrote:
> > > > [CC Lukas and linux-pm]
> > > >
> > > > On Mon, Jan 14, 2019 at 1:32 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, Jan 11, 2019 at 3:49 PM Russell King - ARM Linux
> > > > > <linux@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > [cut]
> > > >
> > > > > >
> > > > > > This thread is discussing how to deal with Armada DRM, its use of the
> > > > > > component helper, TDA998x's hybrid use of the component helper and
> > > > > > DRM bridges.
> > > > > >
> > > > > > Currently, DRM bridges register into the DRM system and are added to
> > > > > > a global list of bridges.  When a DRM driver binds, it looks up the
> > > > > > DRM bridge and attaches to it.  There is no way in generic DRM code
> > > > > > for the DRM driver to know if the DRM bridge is unbound from DRM,
> > > > > > consequently a DRM driver may continue trying to call functions in
> > > > > > the DRM bridge driver using memory that has already been freed.
> > > > > >
> > > > > > We had similar issues with imx-drm, and a number of years ago at a
> > > > > > kernel summit, this was discussed, and I proposed a system which is
> > > > > > now known as the component helper.  This handles the problem of a
> > > > > > multi-component system.
> > > > > >
> > > > > > However, DRM bridge was already established, and there appears to be
> > > > > > no desire to convert DRM bridges and DRM drivers to use the component
> > > > > > helper.
> > > > > >
> > > > > > We are presently in the situation where Armada DRM is incompatible
> > > > > > with the DRM bridges way of doing things, and making it compatible
> > > > > > essentially means introducing potential use-after-free bugs into the
> > > > > > code.
> > > > > >
> > > > > > Device links in their stateful form appear to provide an alternative
> > > > > > to the component helper, in that a stateful device link will remove
> > > > > > consumers of a resource when the supplier is going away - which is
> > > > > > exactly the problem which the component helper is solving.  The
> > > > > > difference is that device links look like being a cleaner solution.
> > > > > >
> > > > > > Just like the component helper, a stateful link would unbind the
> > > > > > consumer of a resource when the supplier goes away - which is exactly
> > > > > > the behaviour we're wanting.
> > > > > >
> > > > > > The problem is that the connection between various drivers is only
> > > > > > really known when the drivers obtain their resources, and the
> > > > > > following can happen:
> > > > > >
> > > > > > supplier                consumer
> > > > > >
> > > > > > probe()
> > > > > >  alloc
> > > > > >                         probe()
> > > > > >  publish
> > > > > >                         obtain supplier's resource
> > > > > >  return
> > > > > >
> > > > > > Where things go wrong is if a stateful link is created when the
> > > > > > consumer obtains the suppliers resource before the supplier has
> > > > > > finished probing - which from what's been said is illegal.
> > > > >
> > > > > It just doesn't work (which means "invalid" rather than "illegal" I
> > > > > guess, but whatever :-)).
> > > > >
> > > > > Admittedly, the original design didn't take this particular case into
> > > > > account and I'm not actually sure how it can be taken into account
> > > > > either.
> > > > >
> > > > > If the link is created by the consumer in the scenario above, its
> > > > > status will be updated twice in a row after the consumer probe returns
> > > > > and after the supplier probe returns.  It looks like this update would
> > > > > need to work regardless of the order it which this happened which
> > > > > sounds somewhat challenging.  I would need to think about that.
> > > >
> > > > So if I'm not mistaken it can be made work with the help of the
> > > > (completely untested) attached patch (of course, the documentation
> > > > would need to be updated too).
> > > >
> > > > The key observation here is that it should be fine to create a link
> > > > from the consumer driver's probe while the supplier is still probing
> > > > if the consumer has some way to find out that the supplier is
> > > > functional at that point (like when it has published itself already in
> > > > your example above).  The initial state of the link can be "consumer
> > > > probe" in that case and I don't see a reason why that might not work.
> > > >
> > >
> > > Below is a more complete patch that should take all of the corner cases
> > > into account unless I have missed anything.  Testing it would be
> > > appreciated. :-)
> > >
> > > ---
> > > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > Subject: [PATCH] driver core: Fix adding device links to probing suppliers
> > >
> > > Currently, it is not valid to add a device link from a consumer
> > > driver ->probe() callback to a supplier that is still probing too, but
> > > generally this is a valid use case.  For example, if the consumer has
> > > just acquired a resource that can only be available when the supplier
> > > is functional, adding a device link to that supplier right away
> > > should be safe (and even desirable arguably), but device_link_add()
> > > doesn't handle that case correctly and the initial state of the link
> > > created by it is wrong then.
> > >
> > > To address this problem, change the initial state of device links
> > > added between a probing supplier and a probing consumer to
> > > DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
> > > skip such links on the supplier side.
> > >
> > > With this change, if the supplier probe completes first,
> > > device_links_driver_bound() called for it will skip the link state
> > > update and when it is called for the consumer, the link state will
> > > be updated to "active".  In turn, if the consumer probe completes
> > > first, device_links_driver_bound() called for it will change the
> > > state of the link to "active" and when it is called for the
> > > supplier, the link status update will be skipped.
> > >
> > > However, in principle the supplier or consumer probe may still fail
> > > after the link has been added, so modify device_links_no_driver() to
> > > change device links in the "active" or "consumer probe" state to
> > > "dormant" on the supplier side and update __device_links_no_driver()
> > > to change the link state to "available" only if it is "consumer
> > > probe" or "active".
> > >
> > > Then, if the supplier probe fails first, the leftover link to the
> > > probing consumer will become "dormant" and device_links_no_driver()
> > > called for the consumer (when its probe fails) will clean it up.
> > > In turn, if the consumer probe fails first, it will either drop the
> > > link, or change its state to "available" and, in the latter case,
> > > when device_links_no_driver() is called for the supplier, it will
> > > update the link state to "dormant".  [If the supplier probe fails,
> > > but the consumer probe succeeds, which should not happen as long as
> > > the consumer driver is correct, the link still will be around, but
> > > it will be "dormant" until the supplier is probed again.]
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >
> > Pulling in a bunch of drm_bridge and drm_panel people. See huge
> > discussion upthread, this here is Rafael's idea for making device
> > links more useful. Would be great if someone could test this out with
> > panel/bridge dependencies.
> >
> > I guess this here isn't yet solving the reprobe issue where the
> > provider was unloaded and reloaded (which should cause the consumer to
> > reprobe too, with the EPROBE_DEFERRED logic). I think that was the
> > other issue we've hit.
>
> I don't think it is addressed here.
>
> Can anyone please explain to me what happens in more detail?

My understanding (and this might be wrong, Russell, Andrzej, ... pls
correct) is that the following sequence goes wrong:

- componentized driver (both producer and consumer) fully loaded&working
- you unbind the producer/one of the components
- component framework or driver core through device_link also unbinds
the master/consumer
- you reload/rebind the component
- with the component framework this results in the master being
rebound, and the overall driver working again. With device_links
nothing happens.

I think there was a patch floating around once to put drivers unbound
due to device_links onto the deferred probe list, so that the next
time something is bound they have a chance to rebind. But I can't find
that patch anymore, maybe someone else has the link still?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux