On Thu, Oct 30, 2014 at 10:09:28AM +0000, Russell King - ARM Linux wrote: > On Thu, Oct 30, 2014 at 11:01:02AM +0100, Andrzej Hajda wrote: > > On 10/29/2014 10:14 AM, Thierry Reding wrote: > > > On Wed, Oct 29, 2014 at 09:57:02AM +0100, Daniel Vetter wrote: > > >> I think we nee try_get_module for the code and kref on the actual data > > >> structures. > > > > > > Agreed, that should do the trick. We'd probably need some sort of logic > > > to also make operations return something like -ENODEV when the > > > underlying device has disappeared. I think David had introduced > > > something similar for DRM device not so long ago? > > > > If the underlying device disappears it would be good to receive > > notification anyway to trigger DRM HPD event. And if we have the > > notification, we can release references to the device smoothly. We do > > not need to play tricky games with krefs, zombie data and module > > refcounting. > > Any solution which thinks it needs to lock modules into the core is > fundamentally broken. It totally misses the point. > > While you can lock a module into the core using try_get_module(), that > doesn't stop the device itself being unbound from a driver. Soo many > people have fallen into that trap. They write their device driver, > along with some kind of framework which they make use try_get_module(). > They think its safe. When you then echo the device name into the > driver's unbind sysfs file, all hell breaks loose and the kernel oopses. > > try_get_module is /totally/ useless for ensuring that things stick around. > > The reality is that you can't make devices stick around. Once that > remove callback from the driver layer is called, that's it, the device > _is_ going away whether you like it or not. You can't stop it. It's > no good returning -EBUSY, because the remove return code is ignored. > > What's more scarey is when you consider that in a real hotplug > situation, when the remove callback is called, the device has > /already/ gone. > > So please, stop thinking that try_get_module() has some magic solution. > Any "solution" to device lifetimes using try_get_module() totally misses > the problem, and is just mere obfuscation and actually a bug in itself. We need proper refcounting ofc, but we also need to make sure that as long as the thing is around the text section for the final unref (at least that) doesn't go poof. I'd prefer if the framework takes care of that detail and drivers could just supply a THIS_MODULE at the right place. But fully agree on your overall point that try_get_module alone is pure snake oil. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel