Hi Daniel,

On Wednesday 30 Nov 2016 11:55:20 Daniel Vetter wrote:
> On Wed, Nov 30, 2016 at 11:56:25AM +0200, Laurent Pinchart wrote:
> > On Wednesday 30 Nov 2016 10:12:01 Daniel Vetter wrote:
> >> On Wed, Nov 30, 2016 at 10:25:57AM +0200, Laurent Pinchart wrote:
> >>> On Wednesday 30 Nov 2016 09:13:00 Daniel Vetter wrote:
> >>>> On Wed, Nov 30, 2016 at 12:06:25AM +0200, Laurent Pinchart wrote:
> >>>>> On Tuesday 29 Nov 2016 23:12:56 Jyri Sarha wrote:
> >>>>>> Store the module that provides the bridge and adjust its refcount
> >>>>>> accordingly. The bridge module unload should not be allowed while
> >>>>>> the bridge is attached.
> >>>>>>
> >>>>>> Signed-off-by: Jyri Sarha <jsarha@xxxxxx>
> >>>>>> ---
> >>>>>>
> >>>>>> drivers/gpu/drm/drm_bridge.c | 15 ++++++++++++---
> >>>>>> include/drm/drm_bridge.h     | 11 ++++++++++-
> >>>>>> 2 files changed, 22 insertions(+), 4 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/drm_bridge.c
> >>>>>> b/drivers/gpu/drm/drm_bridge.c
> >>>>>> index 0ee052b..36d427b 100644
> >>>>>> --- a/drivers/gpu/drm/drm_bridge.c
> >>>>>> +++ b/drivers/gpu/drm/drm_bridge.c
> >>>
> >>> [snip]
> >>>
> >>>>>> @@ -114,6 +118,9 @@ int drm_bridge_attach(struct drm_device *dev,
> >>>>>> struct drm_bridge *bridge)
> >>>>>>  	if (bridge->dev)
> >>>>>>  		return -EBUSY;
> >>>>>>
> >>>>>> +	if (!try_module_get(bridge->module))
> >>>>>> +		return -ENOENT;
> >>>>>
> >>>>> Isn't this still racy ? What happens if the module is removed right
> >>>>> before this call ? Won't the bridge object be freed, and this code
> >>>>> then try to call try_module_get() on freed memory ?
> >>>>>
> >>>>> To fix this properly I think we need to make the bridge object
> >>>>> refcounted, with a release callback to signal to the bridge driver
> >>>>> that memory can be freed. The refcount should be increased in
> >>>>> of_drm_find_bridge(), and decreased in a new drm_bridge_put()
> >>>>> function (the "fun" part will be to update drivers to call that :-S).
> >>>>>
> >>>>> The module refcount still needs to be increased in
> >>>>> drm_bridge_attach() like you do here, but you'll need to protect it
> >>>>> with bridge_lock to avoid a race between try_module_get() and
> >>>>> drm_bridge_remove().
> >>>>
> >>>> Hm right, I thought _attach is the function called directly, but we
> >>>> do lookup+attach. Might be good to have an OF helper which does the
> >>>> lookup+attach and then drops the reference. We can do that as long as
> >>>> attach/detach holds onto a reference like the one below.
> >>>>
> >>>> And I don't think we need a new bridge refcount, we can just use
> >>>> try_module_get/put for that (but wrapped into helpers ofc).
> >>>
> >>> The bridge refcount and the module refcount serve two different
> >>> purposes. The first one addresses the unbind race by preventing the
> >>> object from being freed while still referenced, to avoid faulty data
> >>> memory accesses (note that increasing a module refcount doesn't
> >>> prevent unbinding the module from the device). The second one
> >>> addresses the module unload race by preventing code from being
> >>> unloaded while still being reachable through function pointers,
> >>> avoiding faulty text memory accesses (as well as data memory accesses
> >>> to any .data or .rodata section in the module, if those are referenced
> >>> through pointers in the bridge object).
> >>>
> >>> We thus need both types of refcounting, with the former tied to the
> >>> lookup operation and the latter to the attach operation (even though we
> >>> could handle module refcounting at lookup time as well if preferred,
> >>> but in a well-behaved system the bridge callbacks should not be invoked
> >>> before attach time).
> >>
> >> So you're saying struct drm_bridge should embed a struct device? Or at
> >> least kobj? Handling that race is one of the reasons we have them ...
> >
> > I'm thinking about kref + a release callback.
> > The really fun part to solve this in the long term will be to teach the
> > DRM core about encoder hot-unplug :-) In the meantime the minimum we need
> > is to allow the system to fail safely when a bridge is unplugged, which
> > will involve safe recovery from .atomic_commit() failures.
>
> Why exactly do you want to hotplug encoders? Or bridges fwiw ... since at
> least only making those hotpluggable will make the uabi story easier since
> they're not exposed.

Ideally to avoid disabling the whole display engine when one encoder isn't
available/operational. Right now we're waiting for all pieces to be available
(using deferred probing or the component framework) before registering the DRM
device. This means that if one bridge can't be probed successfully for any
reason we'll end up having no display at all. This includes the case where the
driver for the bridge is not available. If we could support dynamic hotplug of
bridges, we could start with a display engine that supports a subset of the
outputs, and add new outputs as they become operational.

We have a similar issue when unbinding bridge devices from their driver. They
obviously can't be used anymore, but we have no solution to handle that apart
from unregistering the DRM device completely, as otherwise rebinding the
bridge to the driver later can't be handled.

-- 
Regards,

Laurent Pinchart
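
To make the distinction between the two refcounts discussed in this thread
concrete, here is a rough userspace model in plain C. It is only a sketch and
not kernel code: every model_* name is hypothetical, there is no locking, and
the plain int counters merely stand in for a kref and for the module use
count. It assumes the split described above: the object reference is taken at
lookup time, the module reference at attach time.

```c
/* Userspace model only. None of the model_* names exist in the DRM core. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_module {
	int refcount;    /* stands in for the module use count */
	bool unloading;  /* once set, model_module_try_get() fails */
};

struct model_bridge {
	int refcount;    /* stands in for a kref */
	struct model_module *owner;
	bool attached;
	void (*release)(struct model_bridge *bridge);
};

static int released_count; /* how many times a release callback has run */

static bool model_module_try_get(struct model_module *mod)
{
	if (mod->unloading)
		return false;
	mod->refcount++;
	return true;
}

static void model_module_put(struct model_module *mod)
{
	assert(mod->refcount > 0);
	mod->refcount--;
}

/* Unload is refused while any attach still pins the module's code. */
static bool model_module_try_unload(struct model_module *mod)
{
	if (mod->refcount > 0)
		return false;
	mod->unloading = true;
	return true;
}

/* Release callback: tells the "driver" that the memory can be freed. */
static void model_bridge_release(struct model_bridge *bridge)
{
	released_count++;
	free(bridge);
}

static struct model_bridge *model_bridge_create(struct model_module *owner)
{
	struct model_bridge *bridge = calloc(1, sizeof(*bridge));

	if (!bridge)
		return NULL;
	bridge->refcount = 1; /* initial reference held by the bridge driver */
	bridge->owner = owner;
	bridge->release = model_bridge_release;
	return bridge;
}

/* Object reference, taken at lookup time. */
static struct model_bridge *model_bridge_get(struct model_bridge *bridge)
{
	bridge->refcount++;
	return bridge;
}

static void model_bridge_put(struct model_bridge *bridge)
{
	assert(bridge->refcount > 0);
	if (--bridge->refcount == 0)
		bridge->release(bridge);
}

/* Module reference, taken at attach time and dropped at detach time. */
static int model_bridge_attach(struct model_bridge *bridge)
{
	if (bridge->attached)
		return -EBUSY;
	if (!model_module_try_get(bridge->owner))
		return -ENOENT;
	bridge->attached = true;
	return 0;
}

static void model_bridge_detach(struct model_bridge *bridge)
{
	assert(bridge->attached);
	bridge->attached = false;
	model_module_put(bridge->owner);
}
```

In this model, a driver unbind that drops the driver's own reference does not
free the bridge as long as a user still holds the reference obtained at lookup
time, and the module cannot be unloaded until the last detach. A real
implementation would additionally need bridge_lock (or similar) around the
lookup and around try_module_get(), which the sketch leaves out.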