Den 29.01.2019 17.50, skrev Daniel Vetter: > On Tue, Jan 29, 2019 at 03:34:46PM +0100, Noralf Trønnes wrote: >> >> >> Den 24.01.2019 18.57, skrev Daniel Vetter: >>> On Thu, Jan 24, 2019 at 6:46 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote: >>>> >>>> On Thu, Jan 24, 2019 at 11:43:12AM +0100, Daniel Vetter wrote: >>>>> On Wed, Jan 23, 2019 at 11:54:07AM +0100, Noralf Trønnes wrote: >>>>>> >>>>>> >>>>>> Den 22.01.2019 20.30, skrev Daniel Vetter: >>>>>>> On Tue, Jan 22, 2019 at 8:07 PM Noralf Trønnes <noralf@xxxxxxxxxxx> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Den 22.01.2019 10.32, skrev Daniel Vetter: >>>>>>>>> On Mon, Jan 21, 2019 at 01:21:46PM +0100, Noralf Trønnes wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Den 21.01.2019 10.55, skrev Daniel Vetter: >>>>>>>>>>> On Mon, Jan 21, 2019 at 10:10:14AM +0100, Daniel Vetter wrote: >>>>>>>>>>>> On Sun, Jan 20, 2019 at 12:43:08PM +0100, Noralf Trønnes wrote: >>>>>>>>>>>>> This adds resource managed (devres) versions of drm_dev_init() and >>>>>>>>>>>>> drm_dev_register(). >>>>>>>>>>>>> >>>>>>>>>>>>> Also added is devm_drm_dev_register_with_fbdev() which sets up generic >>>>>>>>>>>>> fbdev emulation as well. >>>>>>>>>>>>> >>>>>>>>>>>>> devm_drm_dev_register() isn't exported since there are no users. >>>>>>>>>>>>> >>>>>>>>>>>>> Signed-off-by: Noralf Trønnes <noralf@xxxxxxxxxxx> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> <snip> >>>>>>>>>> >>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c >>>>>>>>>>>>> index 381581b01d48..12129772be45 100644 >>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_drv.c >>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_drv.c >>>>>>>>>>>>> @@ -36,6 +36,7 @@ >>>>>>>>>>>>> >>>>>>>>>>>>> #include <drm/drm_client.h> >>>>>>>>>>>>> #include <drm/drm_drv.h> >>>>>>>>>>>>> +#include <drm/drm_fb_helper.h> >>>>>>>>>>>>> #include <drm/drmP.h> >>>>>>>>>>>>> >>>>>>>>>>>>> #include "drm_crtc_internal.h" >>>>>>>>>>>>> @@ -871,6 +872,111 @@ void drm_dev_unregister(struct drm_device *dev) >>>>>>>>>>>>> } >>>>>>>>>>>>> EXPORT_SYMBOL(drm_dev_unregister); >>>>>>>>>>>>> >>>>>>>>>>>>> +static void devm_drm_dev_init_release(void *data) >>>>>>>>>>>>> +{ >>>>>>>>>>>>> + drm_dev_put(data); >>>>>>>>>>> >>>>>>>>>>> We need drm_dev_unplug() here, or this isn't safe. >>>>>>>>>> >>>>>>>>>> This function is only used to cover the error path if probe fails before >>>>>>>>>> devm_drm_dev_register() is called. devm_drm_dev_register_release() is >>>>>>>>>> the one that calls unplug. There are comments about this in the functions. >>>>>>>>> >>>>>>>>> I think I get a prize for being ignorant and blind :-/ >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> +} >>>>>>>>>>>>> + >>>>>>>>>>>>> +/** >>>>>>>>>>>>> + * devm_drm_dev_init - Resource managed drm_dev_init() >>>>>>>>>>>>> + * @parent: Parent device object >>>>>>>>>>>>> + * @dev: DRM device >>>>>>>>>>>>> + * @driver: DRM driver >>>>>>>>>>>>> + * >>>>>>>>>>>>> + * Managed drm_dev_init(). The DRM device initialized with this function is >>>>>>>>>>>>> + * automatically released on driver detach. You must supply a >>>>>>>>>>> >>>>>>>>>>> I think a bit more clarity here would be good: >>>>>>>>>>> >>>>>>>>>>> "... automatically released on driver unbind by callind drm_dev_unplug()." >>>>>>>>>>> >>>>>>>>>>>>> + * &drm_driver.release callback to control the finalization explicitly. >>>>>>>>>>> >>>>>>>>>>> I think a loud warning for these is in order: >>>>>>>>>>> >>>>>>>>>>> "WARNING: >>>>>>>>>>> >>>>>>>>>>> "In generally it is unsafe to use devm functions for drm structures >>>>>>>>>>> because the lifetimes of &drm_device and the underlying &device do not >>>>>>>>>>> match. This here works because it doesn't immediately free anything, but >>>>>>>>>>> only calls drm_dev_unplug(), which internally decrements the &drm_device >>>>>>>>>>> refcount through drm_dev_put(). >>>>>>>>>>> >>>>>>>>>>> "All other drm structures must still be explicitly released in the >>>>>>>>>>> &drm_driver.release callback." >>>>>>>>>>> >>>>>>>>>>> While thinking about this I just realized that with this design we have no >>>>>>>>>>> good place to call drm_atomic_helper_shutdown(). Which we need to, or all >>>>>>>>>>> kinds of things will leak badly (connectors, fb, ...), but there's no >>>>>>>>>>> place to call it: >>>>>>>>>>> - unbind is too early, since we haven't yet called drm_dev_unplug, and the >>>>>>>>>>> drm_dev_unregister in there must be called _before_ we start to shut >>>>>>>>>>> down anything. >>>>>>>>>>> - drm_driver.release is way too late. >>>>>>>>>>> >>>>>>>>>>> Ofc for a real hotunplug there's no point in shutting down the hw (it's >>>>>>>>>>> already gone), but for a driver unload/unbind it would be nice if this >>>>>>>>>>> happens automatically and in the right order. >>>>>>>>>>> >>>>>>>>>>> So not sure what to do here really. >>>>>>>>>> >>>>>>>>>> How about this change: (it breaks the rule of pulling helpers into the >>>>>>>>>> core, so maybe we should put the devm_ functions into the simple KMS >>>>>>>>>> helper instead?) >>>>>>>>> >>>>>>>>> Yeah smells a bit much like midlayer ... What would work is having a pile >>>>>>>>> more devm_ helper functions, so that we onion-unwrap everything correctly, >>>>>>>>> and in the right order. So: >>>>>>>>> >>>>>>>>> - devm_drm_dev_init (always does a drm_dev_put()) >>>>>>>>> >>>>>>>>> - devm_drm_poll_enable (shuts down the poll helper with a devm action) >>>>>>>>> >>>>>>>>> - devm_drm_mode_config_reset (does an atomic_helper_shutdown() as it's cleanup action) >>>>>>>>> >>>>>>>>> - devm_drm_dev_register (grabs an additional drm_dev_get() reference so it >>>>>>>>> can call drm_dev_unplug() unconditionally). >>>>>>>>> >>>>>>>> >>>>>>>> Beautiful! I really like this, it's very flexible. >>>>>>>> >>>>>>>> Where should devm_drm_mode_config_reset() live? It will pull in the >>>>>>>> atomic helper... >>>>>>> >>>>>>> I think a new drm_devm.c helper would be nice for all this stuff. >>>>>>> Especially since you can't freely mix devm-based setup/cleanup with >>>>>>> normal cleanup I think it'd be good to have it all together in one >>>>>>> place. And perhaps even a code example in the DOC: overview. >>>>>>> >>>>>>>>> We'd need to make sure some of the cleanup actions dtrt when the device is >>>>>>>>> gone, but I think we can achieve that by liberally sprinkling >>>>>>>>> drm_dev_enter/exit over them, e.g. the the cleanup action for >>>>>>>>> drm_mode_config_reset would be: >>>>>>>>> >>>>>>>>> { >>>>>>>>> if (drm_dev_enter()) >>>>>>>>> return; >>>>>>>>> >>>>>>>>> drm_atomic_helper_shutdown(); >>>>>>>>> >>>>>>>>> drm_dev_exit(); >>>>>>>>> } >>>>>>>>> >>>>>>>> >>>>>>>> drm_dev_enter() can only be used to check whether the drm_device is >>>>>>>> registered or not, it doesn't say anything about the state of the parent >>>>>>>> device. >>>>>>>> >>>>>>>> All we know is that the device is being unbound from the driver, we >>>>>>>> don't know if it's the device that's being removed or if it's the driver >>>>>>>> that's unregistered. >>>>>>> >>>>>>> You're right, both paths will have called drm_dev_unplug by then. >>>>>>> Silly me. I really liked my idea :-) >>>>>>> >>>>>>>> I have looked at the various call chains: >>>>>>>> >>>>>>>> driver_unregister -> >>>>>>>> bus_remove_driver -> >>>>>>>> driver_detach -> >>>>>>>> device_release_driver_internal >>>>>>>> >>>>>>>> device_unregister -> >>>>>>>> device_del -> >>>>>>>> bus_remove_device -> >>>>>>>> device_release_driver -> >>>>>>>> device_release_driver_internal >>>>>>>> >>>>>>>> sysfs: unbind_store -> >>>>>>>> device_release_driver -> >>>>>>>> device_release_driver_internal >>>>>>>> >>>>>>>> The only way I've found to differentiate between these in a cleanup >>>>>>>> action is that device_del() uses the bus notifier to signal >>>>>>>> BUS_NOTIFY_DEL_DEVICE before calling bus_remove_device(). Such a >>>>>>>> notifier could be used to set a drm_device->parent_removed flag. >>>>>>> >>>>>>> Hm, this might upset Greg KH's code taste ... maybe there's a better >>>>>>> way to do this, but best to prototype a patch with this, send it to >>>>>>> him and ask how to :-) >>>>>>> >>>>>> >>>>>> I'll leave this to the one that needs it. The tinydrm drivers doesn't >>>>>> need to touch hw after DRM unregister. >>>>>> >>>>>>>> Why is it necessary to call drm_atomic_helper_shutdown() here? Doesn't >>>>>>>> everything get disabled when userspace closes? It does in my tinydrm >>>>>>>> world :-) >>>>>>> >>>>>>> Iirc fbdev/fbcon can result in leaks ... at least we've had patches >>>>>>> where drivers leaked drm_connector and drm_framebuffer objects, and >>>>>>> they've been fixed by calling drm_atomic_helper_shutdown() in the >>>>>>> unload path. Maybe this is cargo-culting, but it goes way back to >>>>>>> pre-atomic, where drivers called drm_helper_force_disable_all(). >>>>>>> >>>>>>> If you try to move the fbcon to your tinydrm drivers (con2fb is >>>>>>> apparently the cmdline tool you need, never tried it, I only switch >>>>>>> the kernel's console between fbcon and dummycon and back, not what >>>>>>> fbcon drivers itself), then I think you should be able to reproduce. >>>>>>> And maybe you have a better idea how to deal with this all. >>>>>>> >>>>>>> Note also that there's been proposals floating around to only close an >>>>>>> drm_framebuffer, not also remove it (like the current RMFB ioctl >>>>>>> does), with that closing userspace would not necessarily lead to a >>>>>>> full cleanup. >>>>>>> >>>>>>> Another thing (which doesn't apply to drm_simple_display_pipe drivers) >>>>>>> is if you have the display on, but no planes showing (i.e. all black). >>>>>>> Then all the fbs will be cleaned up, but drm_connector will be >>>>>>> leaking. That's a case where you need drm_atomic_helper_shutdown() >>>>>>> even if fbcon/fbdev isn't even enabled. >>>>>>> >>>>>> >>>>>> Ok, this means that I don't need to call drm_atomic_helper_shutdown() in >>>>>> tinydrm. DRM userspace disables the pipe on close and the generic fbdev >>>>>> emulation also releases everything. >>>>>> Even so, maybe I should use devm_drm_mode_config_reset() after all to >>>>>> keep drivers uniform, to avoid confusion: why doesn't he use it? >>>>> >>>>> Hm maybe there is an official way to solve this, pulling in Greg+Rafael. >>>>> >>>>> Super short summary: We want to start using devm actions to clean up drm >>>>> drivers. Here's the problem: >>>>> - For a driver unload/unbind without hotunplug, we want to properly clean >>>>> up the hardware and shut it all down. >>>> >>>> Then do it on probe/disconnect. >>>> >>>>> - But if the device is unplugged already, that's probably not the best >>>>> idea, and we only want to clean up the kernel's resources/allocations. >>>> >>>> Again, probe/disconnect will be called either way. >>>> >>>> But as you note, memory will NOT be freed by the devm stuff if you >>>> manually unbind a driver from a device. >>>> >>>> So don't touch hardware there, it's not going to work :) >>>> >>>>> What's the recommendation here? I see a few options: >>>>> >>>>> - Make sure everything can deal with this properly. Hotunplug can happen >>>>> anytime, so there's a race no matter what. >>>> >>>> Yes. >>>> >>>>> - Check with the device model whether the struct device is disappearing or >>>>> whether we're just dealing with a driver unbind (no idea how to do >>>>> that), and act accordingly. >>>> >>>> You don't know that, sorry. Just do any hardware stuff on disconnect. >>>> Assuming your hardware is still present :) >>>> >>>>> - Fundamental question: Touching the hw from devm actions, is that ok? If >>>>> not, then the pretty nifty plan laid out in this thread wont work. >>>> >>>> Nope, that's not going to work, the device could either be long gone, or >>>> you will not be called due to unbind happening from userspace. >>>> >>>> But really, unbind from userspace is very very rare, it's a developer >>>> thing mostly. Oh and a virtual driver thing, but those people are crazy >>>> :) >>>> >>>>> - Something completely different? >>>> >>>> Do it in disconnect :) >>> >>> Ah, I forgot to mention the important constraint :-) disconnect/unbind >>> should be the following sequence: >>> >>> 1. Unregister all the userspace interfaces (there's a lot of them) and >>> make sure all the pending ioctls are done so that from now on >>> userspace sees lots of -EIO (in case it still has fd open, which is >>> going to be the normal for hotunplug. >>> >>> 2. Shut down hw and all ongoing operations (only relevant for unbind, >>> but needs to be able to cope with sudden hotunplug on top anyway). >>> >>> 3. Clean up the kernel mess and release everything. >>> >>> Probe is exactly the other way round, so would perfectly fit into the >>> devm onion cleanup. See in the commented earlier replies above how >>> that would match in details, but tldr; if we have to do 2. in >>> disconnect, then we also have to do 1. in disconnected, and only doing >>> 3. through devm is almost not worth the bother. But if we could do all >>> three through devm then simple drivers wouldn't even need any >>> disconnect/unbind callback at all. That's our motivation for trying to >>> come up with an answer that's not "do it in disconnect". "do it in >>> disconnect" is how we do it all today already. >>> >>> Yes we're trying to make tiny drivers even smaller, we have enough >>> nowadays that this stuff would be worth it :-) >>> >> >> I think a solution is to say that drivers that want to touch hw on >> disconnect needs to use device_driver->remove to do that. >> >> This is an example driver that doesn't need to touch hw because it's so >> simple that userspace has disabled the pipeline: >> >> static void drm_driver_release(struct drm_device *drm) >> { >> drm_mode_config_cleanup(drm); >> drm_dev_fini(drm); >> kfree(drm); >> } >> >> static struct drm_driver drm_driver = { >> .release = drm_driver_release, >> /* ... */ >> }; >> >> static int driver_probe(struct device *dev) >> { >> struct drm_device *drm; >> int ret; >> >> drm = kzalloc(sizeof(*drm), GFP_KERNEL); >> if (!drm) >> return -ENOMEM; >> >> ret = devm_drm_dev_init(dev, drm, &drm_driver); >> if (ret) { >> kfree(drm); >> return ret; >> } >> >> drm_mode_config_init(drm); >> >> /* Aquire various resources, all managed by devres */ >> >> drm_mode_config_reset(drm); >> >> return devm_drm_dev_register(drm); >> } >> >> struct device_driver driver = { >> .probe = driver_probe, >> }; >> >> >> A driver that wants to touch hardware on disconnect, can look like this: >> >> static void drm_driver_release(struct drm_device *drm) >> { >> drm_mode_config_cleanup(drm); >> drm_dev_fini(drm); >> kfree(drm); >> } >> >> static struct drm_driver drm_driver = { >> .release = drm_driver_release, >> /* ... */ >> }; >> >> static int driver_probe(struct device *dev) >> { >> struct drm_device *drm; >> int ret; >> >> drm = kzalloc(sizeof(*drm), GFP_KERNEL); >> if (!drm) >> return -ENOMEM; >> >> ret = devm_drm_dev_init(dev, drm, &drm_driver); >> if (ret) { >> kfree(drm); >> return ret; >> } >> >> drm_mode_config_init(drm); >> >> /* Aquire various resources, all managed by devres */ >> >> drm_mode_config_reset(drm); >> >> ret = drm_dev_register(drm); >> if (ret) >> return ret; >> >> drm_dev_get(dev); /* If using drm_dev_unplug() */ >> >> dev_set_drvdata(dev, drm); >> >> return 0; >> } >> >> /* This function is called before devres_release_all() */ >> static int driver_remove(struct device *dev) >> { >> struct drm_device *drm = dev_get_drvdata(dev); >> >> drm_dev_unplug(drm); OR drm_dev_unregister(drm); >> drm_atomic_helper_shutdown(drm) >> >> return 0; >> } >> >> struct device_driver driver = { >> .probe = driver_probe, >> .remove = driver_remove, > > That's exactly the pattern I'm trying to avoid, because imo your tiny > driver _also_ should do this. Or we realize that all the current drivers > doing drm_atomic_helper_shutdown are misguided, but I'm not really > understanding why. > I don't know where my head has been, the pipeline on tinydrm isn't disabled on driver module unload. I've so preoccupied with ensuring that device removal is working that I must have had some kind of tunnel vision. The problem is that fbdev is restored on lastclose regardless of it being in use or not. I tried to address it in this patch: drm/fb-helper/generic: Only restore when in use https://patchwork.freedesktop.org/patch/263271/ > Having a devm helper which cannot be used for some drivers due to > fundamental design issues is kinda not great, because it means everyone > will still use it, and shrug the bugs off as "not my problem". Which is > what's happening right now with all the devm_kzalloc we have in drm > drivers that all the get lifetim wrong. But because devm_ is convenient > everyone uses it, so the driver unload of most drivers is full of bugs. Yes, it's starting to look like this devm idea isn't such a great thing after all. And it looks like dropping DRM devm can still give a slim driver: static int driver_probe(struct device *dev) { struct drm_device *drm; int ret; drm = kzalloc(sizeof(*drm), GFP_KERNEL); if (!drm) return -ENOMEM; ret = drm_dev_init(dev, drm, &drm_driver); if (ret) { kfree(drm); return ret; } drm_mode_config_init(drm); /* * Aquire various resources, all managed by devres * or being released in drm_driver->release. * goto err_put on failure */ drm_mode_config_reset(drm); ret = drm_dev_register(drm); if (ret) goto err_put; dev_set_drvdata(dev, drm); return 0; err_put: drm_dev_put(dev); return ret } static int driver_remove(struct device *dev) { struct drm_device *drm = dev_get_drvdata(dev); drm_atomic_helper_shutdown(drm) drm_dev_unplug(drm); return 0; } Noralf. _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel