On Wed, Nov 30, 2016 at 11:56:25AM +0200, Laurent Pinchart wrote:
> Hi Daniel,
>
> On Wednesday 30 Nov 2016 10:12:01 Daniel Vetter wrote:
> > On Wed, Nov 30, 2016 at 10:25:57AM +0200, Laurent Pinchart wrote:
> > > On Wednesday 30 Nov 2016 09:13:00 Daniel Vetter wrote:
> > >> On Wed, Nov 30, 2016 at 12:06:25AM +0200, Laurent Pinchart wrote:
> > >>> On Tuesday 29 Nov 2016 23:12:56 Jyri Sarha wrote:
> > >>>> Store the module that provides the bridge and adjust its refcount
> > >>>> accordingly. The bridge module unload should not be allowed while the
> > >>>> bridge is attached.
> > >>>>
> > >>>> Signed-off-by: Jyri Sarha <jsarha@xxxxxx>
> > >>>> ---
> > >>>>
> > >>>>  drivers/gpu/drm/drm_bridge.c | 15 ++++++++++++---
> > >>>>  include/drm/drm_bridge.h     | 11 ++++++++++-
> > >>>>  2 files changed, 22 insertions(+), 4 deletions(-)
> > >>>>
> > >>>> diff --git a/drivers/gpu/drm/drm_bridge.c
> > >>>> b/drivers/gpu/drm/drm_bridge.c
> > >>>> index 0ee052b..36d427b 100644
> > >>>> --- a/drivers/gpu/drm/drm_bridge.c
> > >>>> +++ b/drivers/gpu/drm/drm_bridge.c
> > >
> > > [snip]
> > >
> > >>>> @@ -114,6 +118,9 @@ int drm_bridge_attach(struct drm_device *dev,
> > >>>> struct drm_bridge *bridge)
> > >>>>  	if (bridge->dev)
> > >>>>  		return -EBUSY;
> > >>>>
> > >>>> +	if (!try_module_get(bridge->module))
> > >>>> +		return -ENOENT;
> > >>>
> > >>> Isn't this still racy ? What happens if the module is removed right
> > >>> before this call ? Won't the bridge object be freed, and this code then
> > >>> try to call try_module_get() on freed memory ?
> > >>>
> > >>> To fix this properly I think we need to make the bridge object
> > >>> refcounted, with a release callback to signal to the bridge driver that
> > >>> memory can be freed. The refcount should be increased in
> > >>> of_drm_find_bridge(), and decreased in a new drm_bridge_put() function
> > >>> (the "fun" part will be to update drivers to call that :-S).
> > >>>
> > >>> The module refcount still needs to be increased in drm_bridge_attach()
> > >>> like you do here, but you'll need to protect it with bridge_lock to
> > >>> avoid a race between try_module_get() and drm_bridge_remove().
> > >>
> > >> Hm right, I thought _attach is the function called directly, but we do
> > >> lookup+attach. Might be good to have an of helper which does the
> > >> lookup+attach and then drops the reference. We can do that as long as
> > >> attach/detach holds onto a reference like the one below.
> > >>
> > >> And I don't think we need a new bridge refcount, we can just use
> > >> try_module_get/put for that (but wrapped into helpers ofc).
> > >
> > > The bridge refcount and the module refcount serve two different purposes.
> > > The first one addresses the unbind race by preventing the object from
> > > being freed while still referenced, to avoid faulty data memory accesses
> > > (note that increasing a module refcount doesn't prevent unbinding the
> > > module from the device). The second one addresses the module unload race
> > > by preventing code from being unloaded while still being reachable
> > > through function pointers, avoiding faulty text memory accesses (as well
> > > as data memory accesses to any .data or .rodata section in the module, if
> > > those are referenced through pointers in the bridge object).
> > >
> > > We thus need both types of refcounting, with the former tied to the lookup
> > > operation and the latter to the attach operation (even though we could
> > > handle module refcounting at lookup time as well if preferred, but in a
> > > well-behaved system the bridge callbacks should not be invoked before
> > > attach time).
> >
> > So you're saying struct drm_bridge should embed a struct device? Or at
> > least kobj? Handling that race is one of the reasons we have them ...
>
> I'm thinking about kref + a release callback.
> The really fun part to solve
> this in the long term will be to teach the DRM core about encoder hot-unplug
> :-) In the meantime the minimum we need is to allow the system to fail safely
> when a bridge is unplugged, which will involve safe recovery from
> .atomic_commit() failures.

Why exactly do you want to hotplug encoders? Or bridges fwiw ... at least
making only those hotpluggable would keep the uabi story easier, since
they're not exposed to userspace.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
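
For readers following the thread, the two refcounts Laurent distinguishes
(an object refcount taken at lookup, a module refcount taken at attach) can
be sketched in userspace C. This is only an illustration under stated
assumptions, not the DRM code: struct module_stub, struct bridge, the
one-slot registry and every bridge_*() function here are hypothetical
stand-ins, a C11 atomic counter plays the role of a kref, and the locking
that bridge_lock would provide in the kernel is omitted because the demo is
single-threaded.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct module: only what the demo needs. */
struct module_stub {
	int refs;    /* attached users pinning the code */
	bool dying;  /* set once unload has started */
};

/* Hypothetical refcounted bridge; refcount plays the role of a kref. */
struct bridge {
	atomic_int refcount;
	void (*release)(struct bridge *b);  /* runs when the last ref drops */
	struct module_stub *owner;
};

static struct module_stub demo_module;
static struct bridge *registry;  /* one-slot stand-in for the bridge list */
static int released;             /* demo bookkeeping only */

static void bridge_release(struct bridge *b)
{
	released++;
	free(b);
}

static void bridge_put(struct bridge *b)
{
	/* atomic_fetch_sub() returns the old value: 1 means last reference. */
	if (atomic_fetch_sub(&b->refcount, 1) == 1)
		b->release(b);
}

/* Driver probe: publish the bridge; the registry owns the first reference. */
static struct bridge *bridge_add(struct module_stub *owner)
{
	struct bridge *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	atomic_init(&b->refcount, 1);
	b->release = bridge_release;
	b->owner = owner;
	registry = b;
	return b;
}

/* Lookup hands out an object reference (the kernel would hold a lock). */
static struct bridge *bridge_find(void)
{
	if (!registry)
		return NULL;
	atomic_fetch_add(&registry->refcount, 1);
	return registry;
}

/* Driver unbind: unpublish, then drop the registry's reference. */
static void bridge_remove(struct bridge *b)
{
	registry = NULL;
	bridge_put(b);
}

/* Attach pins the code, loosely mirroring try_module_get() in the patch. */
static int bridge_attach(struct bridge *b)
{
	if (b->owner->dying)
		return -ENOENT;
	b->owner->refs++;
	return 0;
}

static void bridge_detach(struct bridge *b)
{
	b->owner->refs--;
}

/* Lookup, attach, unbind, final put; returns how many releases ran. */
int bridge_demo(void)
{
	int before = released;
	struct bridge *owner, *user;
	int ret;

	demo_module.dying = false;
	owner = bridge_add(&demo_module);
	assert(owner);
	user = bridge_find();           /* object refs: 2 */
	ret = bridge_attach(user);      /* module refs: 1 */
	assert(ret == 0);
	bridge_detach(user);            /* module refs: 0 */

	bridge_remove(owner);           /* object refs: 1, still allocated */
	demo_module.dying = true;       /* unload may now begin */
	assert(released == before);
	assert(bridge_find() == NULL);  /* no longer discoverable */
	ret = bridge_attach(user);      /* reading user->owner is still safe */
	assert(ret == -ENOENT);

	bridge_put(user);               /* last ref: release callback frees */
	return released - before;
}
```

Calling bridge_demo() from a main() returns the number of release callbacks
that ran, which is one here. The point of the walk-through is that after
bridge_remove() the lookup reference alone keeps the memory valid, while
only the attach-time module reference says anything about whether the code
may still be called.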