Hi Daniel,

On Wednesday 30 Nov 2016 10:12:01 Daniel Vetter wrote:
> On Wed, Nov 30, 2016 at 10:25:57AM +0200, Laurent Pinchart wrote:
> > On Wednesday 30 Nov 2016 09:13:00 Daniel Vetter wrote:
> >> On Wed, Nov 30, 2016 at 12:06:25AM +0200, Laurent Pinchart wrote:
> >>> On Tuesday 29 Nov 2016 23:12:56 Jyri Sarha wrote:
> >>>> Store the module that provides the bridge and adjust its refcount
> >>>> accordingly. The bridge module unload should not be allowed while the
> >>>> bridge is attached.
> >>>>
> >>>> Signed-off-by: Jyri Sarha <jsarha@xxxxxx>
> >>>> ---
> >>>>
> >>>>  drivers/gpu/drm/drm_bridge.c | 15 ++++++++++++---
> >>>>  include/drm/drm_bridge.h     | 11 ++++++++++-
> >>>>  2 files changed, 22 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/drm_bridge.c
> >>>> b/drivers/gpu/drm/drm_bridge.c
> >>>> index 0ee052b..36d427b 100644
> >>>> --- a/drivers/gpu/drm/drm_bridge.c
> >>>> +++ b/drivers/gpu/drm/drm_bridge.c
> >
> > [snip]
> >
> >>>> @@ -114,6 +118,9 @@ int drm_bridge_attach(struct drm_device *dev,
> >>>> struct drm_bridge *bridge)
> >>>>  	if (bridge->dev)
> >>>>  		return -EBUSY;
> >>>>
> >>>> +	if (!try_module_get(bridge->module))
> >>>> +		return -ENOENT;
> >>>
> >>> Isn't this still racy ? What happens if the module is removed right
> >>> before this call ? Won't the bridge object be freed, and this code then
> >>> try to call try_module_get() on freed memory ?
> >>>
> >>> To fix this properly I think we need to make the bridge object
> >>> refcounted, with a release callback to signal to the bridge driver that
> >>> memory can be freed. The refcount should be increased in
> >>> of_drm_find_bridge(), and decreased in a new drm_bridge_put() function
> >>> (the "fun" part will be to update drivers to call that :-S).
> >>>
> >>> The module refcount still needs to be increased in drm_bridge_attach()
> >>> like you do here, but you'll need to protect it with bridge_lock to
> >>> avoid a race between try_module_get() and drm_bridge_remove().
> >>
> >> Hm right, I thought _attach is the function called directly, but we do
> >> lookup+attach. Might be good to have an of helper which does the
> >> lookup+attach and then drops the reference. We can do that as long as
> >> attach/detach holds onto a reference like the one below.
> >>
> >> And I don't think we need a new bridge refcount, we can just use
> >> try_module_get/put for that (but wrapped into helpers ofc).
> >
> > The bridge refcount and the module refcount serve two different purposes.
> > The first one addresses the unbind race by preventing the object from
> > being freed while still referenced, to avoid faulty data memory accesses
> > (note that increasing a module refcount doesn't prevent unbinding the
> > module from the device). The second one addresses the module unload race
> > by preventing code from being unloaded while still being reachable
> > through function pointers, avoiding faulty text memory accesses (as well
> > as data memory accesses to any .data or .rodata section in the module, if
> > those are referenced through pointers in the bridge object).
> >
> > We thus need both types of refcounting, with the former tied to the lookup
> > operation and the latter to the attach operation (even though we could
> > handle module refcounting at lookup time as well if preferred, but in a
> > well-behaved system the bridge callbacks should not be invoked before
> > attach time).
>
> So you're saying struct drm_bridge should embed a struct device? Or at
> least kobj? Handling that race is one of the reasons we have them ...

I'm thinking about kref + a release callback. The really fun part to solve
this in the long term will be to teach the DRM core about encoder hot-unplug
:-)

In the meantime the minimum we need is to allow the system to fail safely when
a bridge is unplugged, which will involve safe recovery from .atomic_commit()
failures.
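
To make the split between the two refcounts concrete, here is a rough
userspace model of the idea. It is only a sketch, not kernel code and not a
tested patch: every name in it (fake_module, fake_bridge, bridge_get(),
bridge_put(), bridge_attach(), bridge_detach(), demo_release()) is
hypothetical, the plain int counters stand in for struct kref and the module
refcount, and the bridge_lock protection discussed above is left out because
the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct module. */
struct fake_module {
	int refcount;
	bool unloading;		/* set once module removal has started */
};

/* Hypothetical stand-in for struct drm_bridge with an object refcount. */
struct fake_bridge {
	int refcount;				/* models a struct kref */
	void (*release)(struct fake_bridge *);	/* driver may free memory */
	struct fake_module *module;
	bool attached;
};

/* Models try_module_get(): fails once the module is going away. */
static bool fake_try_module_get(struct fake_module *mod)
{
	if (mod->unloading)
		return false;
	mod->refcount++;
	return true;
}

/* Models module_put(). */
static void fake_module_put(struct fake_module *mod)
{
	assert(mod->refcount > 0);
	mod->refcount--;
}

/* Lookup side: pins the object (data), as kref_get() would. */
static struct fake_bridge *bridge_get(struct fake_bridge *bridge)
{
	bridge->refcount++;
	return bridge;
}

/* Models kref_put(): the last put invokes the release callback. */
static void bridge_put(struct fake_bridge *bridge)
{
	assert(bridge->refcount > 0);
	if (--bridge->refcount == 0)
		bridge->release(bridge);
}

/* Attach side: pins the module (code) and keeps an object reference. */
static int bridge_attach(struct fake_bridge *bridge)
{
	if (bridge->attached)
		return -1;	/* the patch returns -EBUSY here */
	if (!fake_try_module_get(bridge->module))
		return -1;	/* the patch returns -ENOENT here */
	bridge_get(bridge);
	bridge->attached = true;
	return 0;
}

static void bridge_detach(struct fake_bridge *bridge)
{
	/* Save the pointer first: bridge_put() may release the bridge. */
	struct fake_module *mod = bridge->module;

	bridge->attached = false;
	bridge_put(bridge);	/* release() is module code, so it runs first */
	fake_module_put(mod);	/* only now may the module be unloaded */
}

/* Example driver-side release callback; it only counts invocations. */
static int released_count;

static void demo_release(struct fake_bridge *bridge)
{
	(void)bridge;
	released_count++;
}
```

In this model the lookup path would call bridge_get() and the caller would
drop that reference with bridge_put() once attach has taken its own. The
driver's unbind path drops the initial reference, and demo_release() only
runs after the last user has detached.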
-- 
Regards,

Laurent Pinchart

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel