Den 24.01.2019 18.57, skrev Daniel Vetter: > On Thu, Jan 24, 2019 at 6:46 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote: >> >> On Thu, Jan 24, 2019 at 11:43:12AM +0100, Daniel Vetter wrote: >>> On Wed, Jan 23, 2019 at 11:54:07AM +0100, Noralf Trønnes wrote: >>>> >>>> >>>> Den 22.01.2019 20.30, skrev Daniel Vetter: >>>>> On Tue, Jan 22, 2019 at 8:07 PM Noralf Trønnes <noralf@xxxxxxxxxxx> wrote: >>>>>> >>>>>> >>>>>> >>>>>> Den 22.01.2019 10.32, skrev Daniel Vetter: >>>>>>> On Mon, Jan 21, 2019 at 01:21:46PM +0100, Noralf Trønnes wrote: >>>>>>>> >>>>>>>> >>>>>>>> Den 21.01.2019 10.55, skrev Daniel Vetter: >>>>>>>>> On Mon, Jan 21, 2019 at 10:10:14AM +0100, Daniel Vetter wrote: >>>>>>>>>> On Sun, Jan 20, 2019 at 12:43:08PM +0100, Noralf Trønnes wrote: >>>>>>>>>>> This adds resource managed (devres) versions of drm_dev_init() and >>>>>>>>>>> drm_dev_register(). >>>>>>>>>>> >>>>>>>>>>> Also added is devm_drm_dev_register_with_fbdev() which sets up generic >>>>>>>>>>> fbdev emulation as well. >>>>>>>>>>> >>>>>>>>>>> devm_drm_dev_register() isn't exported since there are no users. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Noralf Trønnes <noralf@xxxxxxxxxxx> >>>>>>>>>> >>>>>>>> >>>>>>>> <snip> >>>>>>>> >>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c >>>>>>>>>>> index 381581b01d48..12129772be45 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/drm_drv.c >>>>>>>>>>> +++ b/drivers/gpu/drm/drm_drv.c >>>>>>>>>>> @@ -36,6 +36,7 @@ >>>>>>>>>>> >>>>>>>>>>> #include <drm/drm_client.h> >>>>>>>>>>> #include <drm/drm_drv.h> >>>>>>>>>>> +#include <drm/drm_fb_helper.h> >>>>>>>>>>> #include <drm/drmP.h> >>>>>>>>>>> >>>>>>>>>>> #include "drm_crtc_internal.h" >>>>>>>>>>> @@ -871,6 +872,111 @@ void drm_dev_unregister(struct drm_device *dev) >>>>>>>>>>> } >>>>>>>>>>> EXPORT_SYMBOL(drm_dev_unregister); >>>>>>>>>>> >>>>>>>>>>> +static void devm_drm_dev_init_release(void *data) >>>>>>>>>>> +{ >>>>>>>>>>> + drm_dev_put(data); >>>>>>>>> >>>>>>>>> We need drm_dev_unplug() here, or this isn't safe. >>>>>>>> >>>>>>>> This function is only used to cover the error path if probe fails before >>>>>>>> devm_drm_dev_register() is called. devm_drm_dev_register_release() is >>>>>>>> the one that calls unplug. There are comments about this in the functions. >>>>>>> >>>>>>> I think I get a prize for being ignorant and blind :-/ >>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>>>> +} >>>>>>>>>>> + >>>>>>>>>>> +/** >>>>>>>>>>> + * devm_drm_dev_init - Resource managed drm_dev_init() >>>>>>>>>>> + * @parent: Parent device object >>>>>>>>>>> + * @dev: DRM device >>>>>>>>>>> + * @driver: DRM driver >>>>>>>>>>> + * >>>>>>>>>>> + * Managed drm_dev_init(). The DRM device initialized with this function is >>>>>>>>>>> + * automatically released on driver detach. You must supply a >>>>>>>>> >>>>>>>>> I think a bit more clarity here would be good: >>>>>>>>> >>>>>>>>> "... automatically released on driver unbind by callind drm_dev_unplug()." >>>>>>>>> >>>>>>>>>>> + * &drm_driver.release callback to control the finalization explicitly. >>>>>>>>> >>>>>>>>> I think a loud warning for these is in order: >>>>>>>>> >>>>>>>>> "WARNING: >>>>>>>>> >>>>>>>>> "In generally it is unsafe to use devm functions for drm structures >>>>>>>>> because the lifetimes of &drm_device and the underlying &device do not >>>>>>>>> match. This here works because it doesn't immediately free anything, but >>>>>>>>> only calls drm_dev_unplug(), which internally decrements the &drm_device >>>>>>>>> refcount through drm_dev_put(). >>>>>>>>> >>>>>>>>> "All other drm structures must still be explicitly released in the >>>>>>>>> &drm_driver.release callback." >>>>>>>>> >>>>>>>>> While thinking about this I just realized that with this design we have no >>>>>>>>> good place to call drm_atomic_helper_shutdown(). Which we need to, or all >>>>>>>>> kinds of things will leak badly (connectors, fb, ...), but there's no >>>>>>>>> place to call it: >>>>>>>>> - unbind is too early, since we haven't yet called drm_dev_unplug, and the >>>>>>>>> drm_dev_unregister in there must be called _before_ we start to shut >>>>>>>>> down anything. >>>>>>>>> - drm_driver.release is way too late. >>>>>>>>> >>>>>>>>> Ofc for a real hotunplug there's no point in shutting down the hw (it's >>>>>>>>> already gone), but for a driver unload/unbind it would be nice if this >>>>>>>>> happens automatically and in the right order. >>>>>>>>> >>>>>>>>> So not sure what to do here really. >>>>>>>> >>>>>>>> How about this change: (it breaks the rule of pulling helpers into the >>>>>>>> core, so maybe we should put the devm_ functions into the simple KMS >>>>>>>> helper instead?) >>>>>>> >>>>>>> Yeah smells a bit much like midlayer ... What would work is having a pile >>>>>>> more devm_ helper functions, so that we onion-unwrap everything correctly, >>>>>>> and in the right order. So: >>>>>>> >>>>>>> - devm_drm_dev_init (always does a drm_dev_put()) >>>>>>> >>>>>>> - devm_drm_poll_enable (shuts down the poll helper with a devm action) >>>>>>> >>>>>>> - devm_drm_mode_config_reset (does an atomic_helper_shutdown() as it's cleanup action) >>>>>>> >>>>>>> - devm_drm_dev_register (grabs an additional drm_dev_get() reference so it >>>>>>> can call drm_dev_unplug() unconditionally). >>>>>>> >>>>>> >>>>>> Beautiful! I really like this, it's very flexible. >>>>>> >>>>>> Where should devm_drm_mode_config_reset() live? It will pull in the >>>>>> atomic helper... >>>>> >>>>> I think a new drm_devm.c helper would be nice for all this stuff. >>>>> Especially since you can't freely mix devm-based setup/cleanup with >>>>> normal cleanup I think it'd be good to have it all together in one >>>>> place. And perhaps even a code example in the DOC: overview. >>>>> >>>>>>> We'd need to make sure some of the cleanup actions dtrt when the device is >>>>>>> gone, but I think we can achieve that by liberally sprinkling >>>>>>> drm_dev_enter/exit over them, e.g. the the cleanup action for >>>>>>> drm_mode_config_reset would be: >>>>>>> >>>>>>> { >>>>>>> if (drm_dev_enter()) >>>>>>> return; >>>>>>> >>>>>>> drm_atomic_helper_shutdown(); >>>>>>> >>>>>>> drm_dev_exit(); >>>>>>> } >>>>>>> >>>>>> >>>>>> drm_dev_enter() can only be used to check whether the drm_device is >>>>>> registered or not, it doesn't say anything about the state of the parent >>>>>> device. >>>>>> >>>>>> All we know is that the device is being unbound from the driver, we >>>>>> don't know if it's the device that's being removed or if it's the driver >>>>>> that's unregistered. >>>>> >>>>> You're right, both paths will have called drm_dev_unplug by then. >>>>> Silly me. I really liked my idea :-) >>>>> >>>>>> I have looked at the various call chains: >>>>>> >>>>>> driver_unregister -> >>>>>> bus_remove_driver -> >>>>>> driver_detach -> >>>>>> device_release_driver_internal >>>>>> >>>>>> device_unregister -> >>>>>> device_del -> >>>>>> bus_remove_device -> >>>>>> device_release_driver -> >>>>>> device_release_driver_internal >>>>>> >>>>>> sysfs: unbind_store -> >>>>>> device_release_driver -> >>>>>> device_release_driver_internal >>>>>> >>>>>> The only way I've found to differentiate between these in a cleanup >>>>>> action is that device_del() uses the bus notifier to signal >>>>>> BUS_NOTIFY_DEL_DEVICE before calling bus_remove_device(). Such a >>>>>> notifier could be used to set a drm_device->parent_removed flag. >>>>> >>>>> Hm, this might upset Greg KH's code taste ... maybe there's a better >>>>> way to do this, but best to prototype a patch with this, send it to >>>>> him and ask how to :-) >>>>> >>>> >>>> I'll leave this to the one that needs it. The tinydrm drivers doesn't >>>> need to touch hw after DRM unregister. >>>> >>>>>> Why is it necessary to call drm_atomic_helper_shutdown() here? Doesn't >>>>>> everything get disabled when userspace closes? It does in my tinydrm >>>>>> world :-) >>>>> >>>>> Iirc fbdev/fbcon can result in leaks ... at least we've had patches >>>>> where drivers leaked drm_connector and drm_framebuffer objects, and >>>>> they've been fixed by calling drm_atomic_helper_shutdown() in the >>>>> unload path. Maybe this is cargo-culting, but it goes way back to >>>>> pre-atomic, where drivers called drm_helper_force_disable_all(). >>>>> >>>>> If you try to move the fbcon to your tinydrm drivers (con2fb is >>>>> apparently the cmdline tool you need, never tried it, I only switch >>>>> the kernel's console between fbcon and dummycon and back, not what >>>>> fbcon drivers itself), then I think you should be able to reproduce. >>>>> And maybe you have a better idea how to deal with this all. >>>>> >>>>> Note also that there's been proposals floating around to only close an >>>>> drm_framebuffer, not also remove it (like the current RMFB ioctl >>>>> does), with that closing userspace would not necessarily lead to a >>>>> full cleanup. >>>>> >>>>> Another thing (which doesn't apply to drm_simple_display_pipe drivers) >>>>> is if you have the display on, but no planes showing (i.e. all black). >>>>> Then all the fbs will be cleaned up, but drm_connector will be >>>>> leaking. That's a case where you need drm_atomic_helper_shutdown() >>>>> even if fbcon/fbdev isn't even enabled. >>>>> >>>> >>>> Ok, this means that I don't need to call drm_atomic_helper_shutdown() in >>>> tinydrm. DRM userspace disables the pipe on close and the generic fbdev >>>> emulation also releases everything. >>>> Even so, maybe I should use devm_drm_mode_config_reset() after all to >>>> keep drivers uniform, to avoid confusion: why doesn't he use it? >>> >>> Hm maybe there is an official way to solve this, pulling in Greg+Rafael. >>> >>> Super short summary: We want to start using devm actions to clean up drm >>> drivers. Here's the problem: >>> - For a driver unload/unbind without hotunplug, we want to properly clean >>> up the hardware and shut it all down. >> >> Then do it on probe/disconnect. >> >>> - But if the device is unplugged already, that's probably not the best >>> idea, and we only want to clean up the kernel's resources/allocations. >> >> Again, probe/disconnect will be called either way. >> >> But as you note, memory will NOT be freed by the devm stuff if you >> manually unbind a driver from a device. >> >> So don't touch hardware there, it's not going to work :) >> >>> What's the recommendation here? I see a few options: >>> >>> - Make sure everything can deal with this properly. Hotunplug can happen >>> anytime, so there's a race no matter what. >> >> Yes. >> >>> - Check with the device model whether the struct device is disappearing or >>> whether we're just dealing with a driver unbind (no idea how to do >>> that), and act accordingly. >> >> You don't know that, sorry. Just do any hardware stuff on disconnect. >> Assuming your hardware is still present :) >> >>> - Fundamental question: Touching the hw from devm actions, is that ok? If >>> not, then the pretty nifty plan laid out in this thread wont work. >> >> Nope, that's not going to work, the device could either be long gone, or >> you will not be called due to unbind happening from userspace. >> >> But really, unbind from userspace is very very rare, it's a developer >> thing mostly. Oh and a virtual driver thing, but those people are crazy >> :) >> >>> - Something completely different? >> >> Do it in disconnect :) > > Ah, I forgot to mention the important constraint :-) disconnect/unbind > should be the following sequence: > > 1. Unregister all the userspace interfaces (there's a lot of them) and > make sure all the pending ioctls are done so that from now on > userspace sees lots of -EIO (in case it still has fd open, which is > going to be the normal for hotunplug. > > 2. Shut down hw and all ongoing operations (only relevant for unbind, > but needs to be able to cope with sudden hotunplug on top anyway). > > 3. Clean up the kernel mess and release everything. > > Probe is exactly the other way round, so would perfectly fit into the > devm onion cleanup. See in the commented earlier replies above how > that would match in details, but tldr; if we have to do 2. in > disconnect, then we also have to do 1. in disconnected, and only doing > 3. through devm is almost not worth the bother. But if we could do all > three through devm then simple drivers wouldn't even need any > disconnect/unbind callback at all. That's our motivation for trying to > come up with an answer that's not "do it in disconnect". "do it in > disconnect" is how we do it all today already. > > Yes we're trying to make tiny drivers even smaller, we have enough > nowadays that this stuff would be worth it :-) > I think a solution is to say that drivers that want to touch hw on disconnect needs to use device_driver->remove to do that. This is an example driver that doesn't need to touch hw because it's so simple that userspace has disabled the pipeline: static void drm_driver_release(struct drm_device *drm) { drm_mode_config_cleanup(drm); drm_dev_fini(drm); kfree(drm); } static struct drm_driver drm_driver = { .release = drm_driver_release, /* ... */ }; static int driver_probe(struct device *dev) { struct drm_device *drm; int ret; drm = kzalloc(sizeof(*drm), GFP_KERNEL); if (!drm) return -ENOMEM; ret = devm_drm_dev_init(dev, drm, &drm_driver); if (ret) { kfree(drm); return ret; } drm_mode_config_init(drm); /* Aquire various resources, all managed by devres */ drm_mode_config_reset(drm); return devm_drm_dev_register(drm); } struct device_driver driver = { .probe = driver_probe, }; A driver that wants to touch hardware on disconnect, can look like this: static void drm_driver_release(struct drm_device *drm) { drm_mode_config_cleanup(drm); drm_dev_fini(drm); kfree(drm); } static struct drm_driver drm_driver = { .release = drm_driver_release, /* ... */ }; static int driver_probe(struct device *dev) { struct drm_device *drm; int ret; drm = kzalloc(sizeof(*drm), GFP_KERNEL); if (!drm) return -ENOMEM; ret = devm_drm_dev_init(dev, drm, &drm_driver); if (ret) { kfree(drm); return ret; } drm_mode_config_init(drm); /* Aquire various resources, all managed by devres */ drm_mode_config_reset(drm); ret = drm_dev_register(drm); if (ret) return ret; drm_dev_get(dev); /* If using drm_dev_unplug() */ dev_set_drvdata(dev, drm); return 0; } /* This function is called before devres_release_all() */ static int driver_remove(struct device *dev) { struct drm_device *drm = dev_get_drvdata(dev); drm_dev_unplug(drm); OR drm_dev_unregister(drm); drm_atomic_helper_shutdown(drm) return 0; } struct device_driver driver = { .probe = driver_probe, .remove = driver_remove, }; Noralf. _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel