devm actions and hw clenaup (was Re: [PATCH 01/11] drm: Add devm_drm_dev_init/register)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 23, 2019 at 11:54:07AM +0100, Noralf Trønnes wrote:
> 
> 
> Den 22.01.2019 20.30, skrev Daniel Vetter:
> > On Tue, Jan 22, 2019 at 8:07 PM Noralf Trønnes <noralf@xxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> Den 22.01.2019 10.32, skrev Daniel Vetter:
> >>> On Mon, Jan 21, 2019 at 01:21:46PM +0100, Noralf Trønnes wrote:
> >>>>
> >>>>
> >>>> Den 21.01.2019 10.55, skrev Daniel Vetter:
> >>>>> On Mon, Jan 21, 2019 at 10:10:14AM +0100, Daniel Vetter wrote:
> >>>>>> On Sun, Jan 20, 2019 at 12:43:08PM +0100, Noralf Trønnes wrote:
> >>>>>>> This adds resource managed (devres) versions of drm_dev_init() and
> >>>>>>> drm_dev_register().
> >>>>>>>
> >>>>>>> Also added is devm_drm_dev_register_with_fbdev() which sets up generic
> >>>>>>> fbdev emulation as well.
> >>>>>>>
> >>>>>>> devm_drm_dev_register() isn't exported since there are no users.
> >>>>>>>
> >>>>>>> Signed-off-by: Noralf Trønnes <noralf@xxxxxxxxxxx>
> >>>>>>
> >>>>
> >>>> <snip>
> >>>>
> >>>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>>>>> index 381581b01d48..12129772be45 100644
> >>>>>>> --- a/drivers/gpu/drm/drm_drv.c
> >>>>>>> +++ b/drivers/gpu/drm/drm_drv.c
> >>>>>>> @@ -36,6 +36,7 @@
> >>>>>>>
> >>>>>>>  #include <drm/drm_client.h>
> >>>>>>>  #include <drm/drm_drv.h>
> >>>>>>> +#include <drm/drm_fb_helper.h>
> >>>>>>>  #include <drm/drmP.h>
> >>>>>>>
> >>>>>>>  #include "drm_crtc_internal.h"
> >>>>>>> @@ -871,6 +872,111 @@ void drm_dev_unregister(struct drm_device *dev)
> >>>>>>>  }
> >>>>>>>  EXPORT_SYMBOL(drm_dev_unregister);
> >>>>>>>
> >>>>>>> +static void devm_drm_dev_init_release(void *data)
> >>>>>>> +{
> >>>>>>> + drm_dev_put(data);
> >>>>>
> >>>>> We need drm_dev_unplug() here, or this isn't safe.
> >>>>
> >>>> This function is only used to cover the error path if probe fails before
> >>>> devm_drm_dev_register() is called. devm_drm_dev_register_release() is
> >>>> the one that calls unplug. There are comments about this in the functions.
> >>>
> >>> I think I get a prize for being ignorant and blind :-/
> >>>
> >>>>
> >>>>>
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +/**
> >>>>>>> + * devm_drm_dev_init - Resource managed drm_dev_init()
> >>>>>>> + * @parent: Parent device object
> >>>>>>> + * @dev: DRM device
> >>>>>>> + * @driver: DRM driver
> >>>>>>> + *
> >>>>>>> + * Managed drm_dev_init(). The DRM device initialized with this function is
> >>>>>>> + * automatically released on driver detach. You must supply a
> >>>>>
> >>>>> I think a bit more clarity here would be good:
> >>>>>
> >>>>> "... automatically released on driver unbind by callind drm_dev_unplug()."
> >>>>>
> >>>>>>> + * &drm_driver.release callback to control the finalization explicitly.
> >>>>>
> >>>>> I think a loud warning for these is in order:
> >>>>>
> >>>>> "WARNING:
> >>>>>
> >>>>> "In generally it is unsafe to use devm functions for drm structures
> >>>>> because the lifetimes of &drm_device and the underlying &device do not
> >>>>> match. This here works because it doesn't immediately free anything, but
> >>>>> only calls drm_dev_unplug(), which internally decrements the &drm_device
> >>>>> refcount through drm_dev_put().
> >>>>>
> >>>>> "All other drm structures must still be explicitly released in the
> >>>>> &drm_driver.release callback."
> >>>>>
> >>>>> While thinking about this I just realized that with this design we have no
> >>>>> good place to call drm_atomic_helper_shutdown(). Which we need to, or all
> >>>>> kinds of things will leak badly (connectors, fb, ...), but there's no
> >>>>> place to call it:
> >>>>> - unbind is too early, since we haven't yet called drm_dev_unplug, and the
> >>>>>   drm_dev_unregister in there must be called _before_ we start to shut
> >>>>>   down anything.
> >>>>> - drm_driver.release is way too late.
> >>>>>
> >>>>> Ofc for a real hotunplug there's no point in shutting down the hw (it's
> >>>>> already gone), but for a driver unload/unbind it would be nice if this
> >>>>> happens automatically and in the right order.
> >>>>>
> >>>>> So not sure what to do here really.
> >>>>
> >>>> How about this change: (it breaks the rule of pulling helpers into the
> >>>> core, so maybe we should put the devm_ functions into the simple KMS
> >>>> helper instead?)
> >>>
> >>> Yeah smells a bit much like midlayer ... What would work is having a pile
> >>> more devm_ helper functions, so that we onion-unwrap everything correctly,
> >>> and in the right order. So:
> >>>
> >>> - devm_drm_dev_init (always does a drm_dev_put())
> >>>
> >>> - devm_drm_poll_enable (shuts down the poll helper with a devm action)
> >>>
> >>> - devm_drm_mode_config_reset (does an atomic_helper_shutdown() as it's cleanup action)
> >>>
> >>> - devm_drm_dev_register (grabs an additional drm_dev_get() reference so it
> >>>   can call drm_dev_unplug() unconditionally).
> >>>
> >>
> >> Beautiful! I really like this, it's very flexible.
> >>
> >> Where should devm_drm_mode_config_reset() live? It will pull in the
> >> atomic helper...
> > 
> > I think a new drm_devm.c helper would be nice for all this stuff.
> > Especially since you can't freely mix devm-based setup/cleanup with
> > normal cleanup I think it'd be good to have it all together in one
> > place. And perhaps even a code example in the DOC: overview.
> > 
> >>> We'd need to make sure some of the cleanup actions dtrt when the device is
> >>> gone, but I think we can achieve that by liberally sprinkling
> >>> drm_dev_enter/exit over them, e.g. the the cleanup action for
> >>> drm_mode_config_reset would be:
> >>>
> >>> {
> >>>       if (drm_dev_enter())
> >>>               return;
> >>>
> >>>       drm_atomic_helper_shutdown();
> >>>
> >>>       drm_dev_exit();
> >>> }
> >>>
> >>
> >> drm_dev_enter() can only be used to check whether the drm_device is
> >> registered or not, it doesn't say anything about the state of the parent
> >> device.
> >>
> >> All we know is that the device is being unbound from the driver, we
> >> don't know if it's the device that's being removed or if it's the driver
> >> that's unregistered.
> > 
> > You're right, both paths will have called drm_dev_unplug by then.
> > Silly me. I really liked my idea :-)
> > 
> >> I have looked at the various call chains:
> >>
> >> driver_unregister ->
> >>     bus_remove_driver ->
> >>         driver_detach ->
> >>             device_release_driver_internal
> >>
> >> device_unregister ->
> >>     device_del ->
> >>         bus_remove_device ->
> >>             device_release_driver ->
> >>                 device_release_driver_internal
> >>
> >> sysfs: unbind_store ->
> >>     device_release_driver ->
> >>         device_release_driver_internal
> >>
> >> The only way I've found to differentiate between these in a cleanup
> >> action is that device_del() uses the bus notifier to signal
> >> BUS_NOTIFY_DEL_DEVICE before calling bus_remove_device(). Such a
> >> notifier could be used to set a drm_device->parent_removed flag.
> > 
> > Hm, this might upset Greg KH's code taste ... maybe there's a better
> > way to do this, but best to prototype a patch with this, send it to
> > him and ask how to :-)
> > 
> 
> I'll leave this to the one that needs it. The tinydrm drivers doesn't
> need to touch hw after DRM unregister.
> 
> >> Why is it necessary to call drm_atomic_helper_shutdown() here? Doesn't
> >> everything get disabled when userspace closes? It does in my tinydrm
> >> world :-)
> > 
> > Iirc fbdev/fbcon can result in leaks ... at least we've had patches
> > where drivers leaked drm_connector and drm_framebuffer objects, and
> > they've been fixed by calling drm_atomic_helper_shutdown() in the
> > unload path. Maybe this is cargo-culting, but it goes way back to
> > pre-atomic, where drivers called drm_helper_force_disable_all().
> > 
> > If you try to move the fbcon to your tinydrm drivers (con2fb is
> > apparently the cmdline tool you need, never tried it, I only switch
> > the kernel's console between fbcon and dummycon and back, not what
> > fbcon drivers itself), then I think you should be able to reproduce.
> > And maybe you have a better idea how to deal with this all.
> > 
> > Note also that there's been proposals floating around to only close an
> > drm_framebuffer, not also remove it (like the current RMFB ioctl
> > does), with that closing userspace would not necessarily lead to a
> > full cleanup.
> > 
> > Another thing (which doesn't apply to drm_simple_display_pipe drivers)
> > is if you have the display on, but no planes showing (i.e. all black).
> > Then all the fbs will be cleaned up, but drm_connector will be
> > leaking. That's a case where you need drm_atomic_helper_shutdown()
> > even if fbcon/fbdev isn't even enabled.
> > 
> 
> Ok, this means that I don't need to call drm_atomic_helper_shutdown() in
> tinydrm. DRM userspace disables the pipe on close and the generic fbdev
> emulation also releases everything.
> Even so, maybe I should use devm_drm_mode_config_reset() after all to
> keep drivers uniform, to avoid confusion: why doesn't he use it?

Hm maybe there is an official way to solve this, pulling in Greg+Rafael.

Super short summary: We want to start using devm actions to clean up drm
drivers. Here's the problem:
- For a driver unload/unbind without hotunplug, we want to properly clean
  up the hardware and shut it all down.

- But if the device is unplugged already, that's probably not the best
  idea, and we only want to clean up the kernel's resources/allocations.

What's the recommendation here? I see a few options:

- Make sure everything can deal with this properly. Hotunplug can happen
  anytime, so there's a race no matter what.

- Check with the device model whether the struct device is disappearing or
  whether we're just dealing with a driver unbind (no idea how to do
  that), and act accordingly.

- Fundamental question: Touching the hw from devm actions, is that ok? If
  not, then the pretty nifty plan laid out in this thread wont work.

- Something completely different?

Thanks, Daniel

> 
> Noralf.
> 
> > Cheers, Daniel
> > 
> > 
> >>> This would be analog to your shutdown parameter below.
> >>>
> >>> Essentially I think we can only do the drm_dev_unplug autocleanup in a
> >>> given driver from devm_drm_dev_register iff all the cleanup actions have
> >>> been devm-ified, and there's nothing left in it's unbind callback. Because
> >>> if anything is left in its unbind callback that's a bug, since
> >>> drm_dev_unregister() (called through drm_dev_unplug) is the very first (or
> >>> at least one of the very first, before we start cleanup up) functions that
> >>> need to be called.
> >>> -Daniel
> >>>
> >>>>
> >>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>> index 12129772be45..7ed9550baff6 100644
> >>>> --- a/drivers/gpu/drm/drm_drv.c
> >>>> +++ b/drivers/gpu/drm/drm_drv.c
> >>>> @@ -34,7 +34,9 @@
> >>>>  #include <linux/slab.h>
> >>>>  #include <linux/srcu.h>
> >>>>
> >>>> +#include <drm/drm_atomic_helper.h>
> >>>>  #include <drm/drm_client.h>
> >>>> +#include <drm/drm_crtc_helper.h>
> >>>>  #include <drm/drm_drv.h>
> >>>>  #include <drm/drm_fb_helper.h>
> >>>>  #include <drm/drmP.h>
> >>>> @@ -355,17 +357,7 @@ void drm_dev_exit(int idx)
> >>>>  }
> >>>>  EXPORT_SYMBOL(drm_dev_exit);
> >>>>
> >>>> -/**
> >>>> - * drm_dev_unplug - unplug a DRM device
> >>>> - * @dev: DRM device
> >>>> - *
> >>>> - * This unplugs a hotpluggable DRM device, which makes it inaccessible to
> >>>> - * userspace operations. Entry-points can use drm_dev_enter() and
> >>>> - * drm_dev_exit() to protect device resources in a race free manner. This
> >>>> - * essentially unregisters the device like drm_dev_unregister(), but can be
> >>>> - * called while there are still open users of @dev.
> >>>> - */
> >>>> -void drm_dev_unplug(struct drm_device *dev)
> >>>> +static void __drm_dev_unplug(struct drm_device *dev, bool shutdown)
> >>>>  {
> >>>>         /*
> >>>>          * After synchronizing any critical read section is guaranteed
> >>>> to see
> >>>> @@ -378,11 +370,32 @@ void drm_dev_unplug(struct drm_device *dev)
> >>>>
> >>>>         drm_dev_unregister(dev);
> >>>>
> >>>> +       if (shutdown)
> >>>> +               drm_kms_helper_poll_fini(dev);
> >>>> +
> >>>>         mutex_lock(&drm_global_mutex);
> >>>> -       if (dev->open_count == 0)
> >>>> +       if (dev->open_count == 0) {
> >>>> +               if (shutdown)
> >>>> +                       drm_atomic_helper_shutdown(dev);
> >>>>                 drm_dev_put(dev);
> >>>> +       }
> >>>>         mutex_unlock(&drm_global_mutex);
> >>>>  }
> >>>> +
> >>>> +/**
> >>>> + * drm_dev_unplug - unplug a DRM device
> >>>> + * @dev: DRM device
> >>>> + *
> >>>> + * This unplugs a hotpluggable DRM device, which makes it inaccessible to
> >>>> + * userspace operations. Entry-points can use drm_dev_enter() and
> >>>> + * drm_dev_exit() to protect device resources in a race free manner. This
> >>>> + * essentially unregisters the device like drm_dev_unregister(), but can be
> >>>> + * called while there are still open users of @dev.
> >>>> + */
> >>>> +void drm_dev_unplug(struct drm_device *dev)
> >>>> +{
> >>>> +       __drm_dev_unplug(dev, false);
> >>>> +}
> >>>>  EXPORT_SYMBOL(drm_dev_unplug);
> >>>>
> >>>>  /*
> >>>> @@ -920,7 +933,7 @@ EXPORT_SYMBOL(devm_drm_dev_init);
> >>>>
> >>>>  static void devm_drm_dev_register_release(void *data)
> >>>>  {
> >>>> -       drm_dev_unplug(data);
> >>>> +       __drm_dev_unplug(data, true);
> >>>>  }
> >>>>
> >>>>  static int devm_drm_dev_register(struct drm_device *dev)
> >>>>
> >>>>
> >>>> I realised that we should take a ref on the parent device so it can be
> >>>> accessed by the DRM_DEV_ functions after unplug.
> >>>>
> >>>>
> >>>> Noralf.
> >>>>
> >>>>
> >>>>>
> >>>>>>> + *
> >>>>>>> + * Note: This function must be used together with
> >>>>>>> + * devm_drm_dev_register_with_fbdev().
> >>>>>>> + *
> >>>>>>> + * RETURNS:
> >>>>>>> + * 0 on success, or error code on failure.
> >>>>>>> + */
> >>>>>>> +int devm_drm_dev_init(struct device *parent,
> >>>>>>> +               struct drm_device *dev,
> >>>>>>> +               struct drm_driver *driver)
> >>>>>>> +{
> >>>>>>> + int ret;
> >>>>>>> +
> >>>>>>> + if (WARN_ON(!parent || !driver->release))
> >>>>>>> +         return -EINVAL;
> >>>>>>> +
> >>>>>>> + ret = drm_dev_init(dev, driver, parent);
> >>>>>>> + if (ret)
> >>>>>>> +         return ret;
> >>>>>>> +
> >>>>>>> + /*
> >>>>>>> +  * This is a temporary release action that is used if probing fails
> >>>>>>> +  * before devm_drm_dev_register() is called.
> >>>>>>> +  */
> >>>>>>> + ret = devm_add_action(parent, devm_drm_dev_init_release, dev);
> >>>>>>> + if (ret)
> >>>>>>> +         devm_drm_dev_init_release(dev);
> >>>>>>> +
> >>>>>>> + return ret;
> >>>>>>> +}
> >>>>>>> +EXPORT_SYMBOL(devm_drm_dev_init);
> >>>>>>> +
> >>>>>>> +static void devm_drm_dev_register_release(void *data)
> >>>>>>> +{
> >>>>>>> + drm_dev_unplug(data);
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +static int devm_drm_dev_register(struct drm_device *dev)
> >>>>>>> +{
> >>>>>>> + int ret;
> >>>>>>> +
> >>>>>>> + ret = drm_dev_register(dev, 0);
> >>>>>>> + if (ret)
> >>>>>>> +         return ret;
> >>>>>>> +
> >>>>>>> + /*
> >>>>>>> +  * This has now served it's purpose, remove it to not mess up ref
> >>>>>>> +  * counting.
> >>>>>>> +  */
> >>>>>>> + devm_remove_action(dev->dev, devm_drm_dev_init_release, dev);
> >>>>>>> +
> >>>>>>> + ret = devm_add_action(dev->dev, devm_drm_dev_register_release, dev);
> >>>>>>> + if (ret)
> >>>>>>> +         devm_drm_dev_register_release(dev);
> >>>>>>> +
> >>>>>>> + return ret;
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +/**
> >>>>>>> + * devm_drm_dev_register_with_fbdev - Resource managed drm_dev_register()
> >>>>>>> + *                                    including generic fbdev emulation
> >>>>>>> + * @dev: DRM device to register
> >>>>>>> + * @fbdev_bpp: Preferred bits per pixel for fbdev (optional)
> >>>>>>> + *
> >>>>>>> + * Managed drm_dev_register() that also calls drm_fbdev_generic_setup().
> >>>>>>> + * The DRM device registered with this function is automatically unregistered on
> >>>>>>> + * driver detach using drm_dev_unplug().
> >>>>>>> + *
> >>>>>>> + * Note: This function must be used together with devm_drm_dev_init().
> >>>>>>> + *
> >>>>>>> + * For testing driver detach can be triggered manually by writing to the driver
> >>>>>>> + * 'unbind' file.
> >>>>>>> + *
> >>>>>>> + * RETURNS:
> >>>>>>> + * 0 on success, negative error code on failure.
> >>>>>>> + */
> >>>>>>> +int devm_drm_dev_register_with_fbdev(struct drm_device *dev,
> >>>>>>> +                              unsigned int fbdev_bpp)
> >>>>>>> +{
> >>>>>>> + int ret;
> >>>>>>> +
> >>>>>>> + ret = devm_drm_dev_register(dev);
> >>>>>>> + if (ret)
> >>>>>>> +         return ret;
> >>>>>>> +
> >>>>>>> + drm_fbdev_generic_setup(dev, fbdev_bpp);
> >>>>>>> +
> >>>>>>> + return 0;
> >>>>>>> +}
> >>>>>>> +EXPORT_SYMBOL(devm_drm_dev_register_with_fbdev);
> >>>>>>> +
> >>>>>>>  /**
> >>>>>>>   * drm_dev_set_unique - Set the unique name of a DRM device
> >>>>>>>   * @dev: device of which to set the unique name
> >>>>>>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> >>>>>>> index 35af23f5fa0d..c3f0477f2e7f 100644
> >>>>>>> --- a/include/drm/drm_drv.h
> >>>>>>> +++ b/include/drm/drm_drv.h
> >>>>>>> @@ -628,6 +628,12 @@ struct drm_device *drm_dev_alloc(struct drm_driver *driver,
> >>>>>>>  int drm_dev_register(struct drm_device *dev, unsigned long flags);
> >>>>>>>  void drm_dev_unregister(struct drm_device *dev);
> >>>>>>>
> >>>>>>> +int devm_drm_dev_init(struct device *parent,
> >>>>>>> +               struct drm_device *dev,
> >>>>>>> +               struct drm_driver *driver);
> >>>>>>> +int devm_drm_dev_register_with_fbdev(struct drm_device *dev,
> >>>>>>> +                              unsigned int fbdev_bpp);
> >>>>>>> +
> >>>>>>>  void drm_dev_get(struct drm_device *dev);
> >>>>>>>  void drm_dev_put(struct drm_device *dev);
> >>>>>>>  void drm_put_dev(struct drm_device *dev);
> >>>>>>> --
> >>>>>>> 2.20.1
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> dri-devel mailing list
> >>>>>>> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> >>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >>>>>>
> >>>>>> --
> >>>>>> Daniel Vetter
> >>>>>> Software Engineer, Intel Corporation
> >>>>>> http://blog.ffwll.ch
> >>>>>
> >>>
> > 
> > 
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux