Re: [PATCH 6/6] drm/tinydrm: Support device unplug

Hi Daniel and Noralf,

On Wednesday, 30 August 2017 20:18:49 EEST Daniel Vetter wrote:
> On Wed, Aug 30, 2017 at 6:31 PM, Noralf Trønnes <noralf@xxxxxxxxxxx> wrote:
> > Den 28.08.2017 23.56, skrev Daniel Vetter:
> >> On Mon, Aug 28, 2017 at 07:17:48PM +0200, Noralf Trønnes wrote:
> >>> Support device unplugging to make tinydrm suitable for USB devices.
> >>> 
> >>> Cc: David Lechner <david@xxxxxxxxxxxxxx>
> >>> Signed-off-by: Noralf Trønnes <noralf@xxxxxxxxxxx>
> >>> ---
> >>> 
> >>> drivers/gpu/drm/tinydrm/core/tinydrm-core.c | 69 ++++++++++++++++++---
> >>> drivers/gpu/drm/tinydrm/mi0283qt.c          |  4 ++
> >>> drivers/gpu/drm/tinydrm/mipi-dbi.c          |  5 ++-
> >>> drivers/gpu/drm/tinydrm/repaper.c           |  9 +++-
> >>> drivers/gpu/drm/tinydrm/st7586.c            |  9 +++-
> >>> include/drm/tinydrm/tinydrm.h               |  5 +++
> >>> 6 files changed, 87 insertions(+), 14 deletions(-)
> >>> 
> >>> diff --git a/drivers/gpu/drm/tinydrm/core/tinydrm-core.c
> >>> b/drivers/gpu/drm/tinydrm/core/tinydrm-core.c
> >>> index f11f4cd..3ccbcc5 100644
> >>> --- a/drivers/gpu/drm/tinydrm/core/tinydrm-core.c
> >>> +++ b/drivers/gpu/drm/tinydrm/core/tinydrm-core.c
> >>> @@ -32,6 +32,29 @@
> >>>   * The driver allocates &tinydrm_device, initializes it using
> >>>   * devm_tinydrm_init(), sets up the pipeline using
> >>> tinydrm_display_pipe_init()
> >>>   * and registers the DRM device using devm_tinydrm_register().
> >>> + *
> >>> + * Device unplug
> >>> + * -------------
> >>> + *
> >>> + * tinydrm supports device unplugging when there's still open DRM or
> >>> fbdev file
> >>> + * handles.
> >>> + *
> >>> + * There are 2 ways for driver-device unbinding to happen:
> >>> + *
> >>> + * - The driver module is unloaded causing the driver to be
> >>> unregistered.
> >>> + *   This can't happen as long as there's open file handles because a
> >>> reference
> >>> + *   is taken on the module.
> >> 
> >> Aside: you can do that, but then it works like a hw unplug, through the
> >> unbind property in sysfs.
> > 
> > The module is still pinned isn't it?
> 
> Yup, but only the code is pinned, not any of the datastructures like
> drm_device.
>
> > I can add sysfs unbind as a third way of triggering unbind:
> >  * - The sysfs driver _unbind_ file can be used to unbind the driver from
> > the
> >  *   device. This can happen any time.
> >  *

Sounds good to me.

> >>> + *
> >>> + * - The device is removed (USB, Device Tree overlay).
> >>> + *   This can happen at any time.
> >>> + *
> >>> + * The driver needs to protect device resources from access after the
> >>> device is
> >>> + * gone. This is done checking drm_dev_is_unplugged(), typically in
> >>> + * &drm_framebuffer_funcs.dirty, &drm_simple_display_pipe_funcs.enable
> >>> and
> >>> + * \.disable. Resources that doesn't face userspace and is only used

s/doesn't/don't/
s/and is/and are/

> >>> with the
> >>> + * device can be setup using devm\_ functions, but &tinydrm_device must
> >>> be
> >>> + * allocated using plain kzalloc() since it's lifetime can exceed that
> >>> of the

s/it's/its/

> >>> + * device. tinydrm_release() will free the structure.
> >> 
> >> So here's a bit a dragon: There's no prevention of is_unplugged racing
> >> against a drm_dev_unplug(). There's been various attempts to fixing this,
> >> but they're all somewhat ugly.
> >> 
> >> Either way, that's a bug in the drm core :-)
> >> 
> >>>    */
> >>>    
> >>>     /**
> >>> @@ -138,6 +161,29 @@ static const struct drm_mode_config_funcs
> >>> tinydrm_mode_config_funcs = {
> >>>         .atomic_commit = drm_atomic_helper_commit,
> >>>   };
> >>> +/**
> >>> + * tinydrm_release - DRM driver release helper
> >>> + * @drm: DRM device
> >>> + *
> >>> + * This function cleans up and finalizes &drm_device and frees
> >>> &tinydrm_device.
> >>> + *
> >>> + * Drivers must use this as their &drm_driver->release callback.
> >>> + */
> >>> +void tinydrm_release(struct drm_device *drm)
> >>> +{
> >>> +       struct tinydrm_device *tdev = drm_to_tinydrm(drm);
> >>> +
> >>> +       DRM_DEBUG_DRIVER("\n");
> >>> +
> >>> +       drm_mode_config_cleanup(drm);
> >>> +       drm_dev_fini(drm);
> >>> +
> >>> +       mutex_destroy(&tdev->dirty_lock);
> >>> +       kfree(tdev->fbdev_cma);
> >>> +       kfree(tdev);
> >>> +}
> >>> +EXPORT_SYMBOL(tinydrm_release);
> >>> +
> >>>   static int tinydrm_init(struct device *parent, struct tinydrm_device
> >>> *tdev,
> >>>                         const struct drm_framebuffer_funcs *fb_funcs,
> >>>                         struct drm_driver *driver)
> >>> @@ -160,8 +206,6 @@ static int tinydrm_init(struct device *parent,
> >>> struct tinydrm_device *tdev,
> >>>   static void tinydrm_fini(struct tinydrm_device *tdev)
> >>>   {
> >>> -       drm_mode_config_cleanup(&tdev->drm);
> >>> -       mutex_destroy(&tdev->dirty_lock);
> >>>         drm_dev_unref(&tdev->drm);
> >>>   }
> >>> @@ -178,8 +222,8 @@ static void devm_tinydrm_release(void *data)
> >>>   * @driver: DRM driver
> >>>   *
> >>>   * This function initializes @tdev, the underlying DRM device and it's

While at it, s/it's/its/

> >>> - * mode_config. Resources will be automatically freed on driver detach
> >>> (devres)
> >>> - * using drm_mode_config_cleanup() and drm_dev_unref().
> >>> + * mode_config. drm_dev_unref() is called on driver detach (devres) and
> >>> when
> >>> + * all refs are dropped, tinydrm_release() is called.
> >>>   *
> >>>   * Returns:
> >>>   * Zero on success, negative error code on failure.
> >>> @@ -226,14 +270,17 @@ static int tinydrm_register(struct tinydrm_device
> >>> *tdev)
> >>> 
> >>>   static void tinydrm_unregister(struct tinydrm_device *tdev)
> >>>   {
> >>> -       struct drm_fbdev_cma *fbdev_cma = tdev->fbdev_cma;
> >>> -
> >>>         drm_atomic_helper_shutdown(&tdev->drm);
> >>> 
> >>> -       /* don't restore fbdev in lastclose, keep pipeline disabled */
> >>> -       tdev->fbdev_cma = NULL;
> >>> -       drm_dev_unregister(&tdev->drm);
> >>> -       if (fbdev_cma)
> >>> -               drm_fbdev_cma_fini(fbdev_cma);
> >>> +
> >>> +       /* Get a ref that will be put in tinydrm_fini() */
> >>> +       drm_dev_ref(&tdev->drm);
> >> 
> >> Why do we need that private ref? Grabbing references in unregister code
> >> looks like a recipe for leaks ...
> > 
> > Yeah, it's better to take the second ref in devm_tinydrm_register().
> > 
> > The reason I need 2 refs is because tinydrm is set up with 2 devm_
> > functions. This way I can leave all error path cleanup in the hands of
> > devres.
> > 
> > static int driver_probe(...)
> > {
> >     ret = devm_tinydrm_init(...);
> >     if (ret)
> >         return ret;
> >     
> >     ret = tinydrm_display_pipe_init(...);
> >     if (ret)
> >         return ret;
> >     
> >     drm_mode_config_reset(...);
> >     
> >     return devm_tinydrm_register(...);
> > }
> 
> I don't think this works. What's worse, devm has a high chance of
> freeing stuff at the wrong time since it's tied to the
> drm_device->dev, and not to drm_device.
> 
> I'm not sure what a better solution is, but when discussing all this
> with Laurent we agreed that in general, devm_ needs to be considered
> harmful. Looks simple, very easy to create broken code that doesn't
> clean up properly on unplug. As much as leaks are bad, freeing before
> all users are gone is worse. devm_ "fixes" the former and heavily
> encourages the latter.

I totally agree. A crash at unbind time is worse than a memory leak, given that
bind/unbind cycles are infrequent. Of course we need to aim for fixing
both problems :-)

> In your case I think devm_tinydrm_register is correct (but a bit
> strange).

Yes, I don't think an explicit tinydrm_unregister() call in the driver's 
.remove() handler would be so difficult.

> devm_tinydrm_init otoh looks like all the code it has should be moved into
> the drm_driver->release callback. You can't destroy the dirty_lock while a
> thread might be holding it right at that moment.

Agreed too.
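
To make the lifetime argument concrete, here is a rough userspace model of
what I have in mind. It is only a sketch: struct fake_tdev and the
fake_tdev_*() helpers are made-up names, a plain atomic counter stands in for
the drm_device reference count and a pthread mutex stands in for dirty_lock.
The point is simply that the lock is destroyed and the structure freed from
the release path, once the last reference is gone, and not at driver detach.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for tinydrm_device; not kernel code. */
struct fake_tdev {
	atomic_int refcount;        /* models the drm_device reference count */
	pthread_mutex_t dirty_lock; /* models tdev->dirty_lock */
	int *released;              /* set to 1 when the release path runs */
};

/* Allocate with one initial reference, owned by the "driver". */
static struct fake_tdev *fake_tdev_alloc(int *released)
{
	struct fake_tdev *tdev = calloc(1, sizeof(*tdev));

	if (!tdev)
		return NULL;
	atomic_init(&tdev->refcount, 1);
	pthread_mutex_init(&tdev->dirty_lock, NULL);
	tdev->released = released;
	*released = 0;
	return tdev;
}

/* Taken by every user that can outlive the driver, e.g. an open file. */
static void fake_tdev_get(struct fake_tdev *tdev)
{
	atomic_fetch_add(&tdev->refcount, 1);
}

/* Models the ->release callback: runs once, after the last user is gone. */
static void fake_tdev_release(struct fake_tdev *tdev)
{
	pthread_mutex_destroy(&tdev->dirty_lock);
	*tdev->released = 1;
	free(tdev);
}

/* Drop a reference; only the final put tears the structure down. */
static void fake_tdev_put(struct fake_tdev *tdev)
{
	if (atomic_fetch_sub(&tdev->refcount, 1) == 1)
		fake_tdev_release(tdev);
}
```

With this split, the detach path only calls fake_tdev_put(), so a thread that
still holds a reference can keep taking dirty_lock safely until it drops that
reference.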

Furthermore the devm_kzalloc() in tinydrm_display_pipe_init() seems suspicious 
to me. It should be easy to replace it with kzalloc(), the mode can be freed 
in tinydrm_connector_destroy().

(And on a side note, the direct mode copy without calling drm_mode_duplicate()
in that function is also suspicious, although it might not be an issue in
practice.)

> >>> +
> >>> +       drm_fbdev_cma_dev_unplug(tdev->fbdev_cma);
> >>> +       drm_dev_unplug(&tdev->drm);
> >>> +
> >>> +       /* Make sure framebuffer flushing is done */
> >>> +       mutex_lock(&tdev->dirty_lock);
> >>> +       mutex_unlock(&tdev->dirty_lock);
> >> 
> >> Is this really needed? Or, doesn't it just paper over a driver bug you
> >> have already anyway, since native kms userspace can directly call
> >> fb->funcs->dirty too, and you already protect against that.
> >> 
> >> This definitely looks like the fbdev helper is leaking implementation
> >> details to callers where it shouldn't do that.
> > 
> > Flushing can happen while drm_dev_unplug() is called, and when we leave
> > this function the device facing resources controlled by devres will be
> > removed. Thus I have to make sure any such flushing is done before
> > leaving so the next flush is stopped by the drm_dev_is_unplugged() check.
> > I don't see any other way of ensuring that.
> > 
> > I see now that I should move the call to drm_atomic_helper_shutdown()
> > after drm_dev_unplug() to properly protect the pipe .enable/.disable
> > callbacks.
> 
> Hm, calling _shutdown when the hw is gone already won't end well.
> Fundamentally this race exists for all use-cases, and I'm somewhat
> leaning towards plugging it in the core.
> 
> The general solution probably involves something that smells a lot
> like srcu, i.e. at every possible entry point into a drm driver
> (ioctl, fbdev, dma-buf sharing, everything really) we take that
> super-cheap read-side lock, and drop it when we leave.

That's similar to what we plan to do in V4L2. The idea is to set a device 
removed flag at the beginning of the .remove() handler and wait for all 
pending operations to complete. The core will reject any new operation when 
the flag is set. To wait for completion, every entry point would increase a 
use count, and decrease it on exit. When the use count is decreased to 0 
waiters will be woken up. This should solve the unplug/user race.

> Then on unplug we call synchronize_srcu, which essentially does what
> your lock grab&drop trick does. Just much less overhead on the read
> side (we can sprinkle this over everything without worrying about
> wasting cpu time), and a bit clearer in the intention on the _unplug
> side.

Looks like srcu already implements what we need. I'm not sure we need an
RCU-based solution though, as it might be too complex.

> But yeah, another thing that'll be a pile of work to fix properly. I'd
> leave it at a FIXME comment in drm_dev_unplug() until we get to it.
> And for now not worry about all the possible races. One thing at a
> time and all that.
> 
> >>>   }
> >>>   
> >>>   static void devm_tinydrm_register_release(void *data)
> >>> diff --git a/drivers/gpu/drm/tinydrm/mi0283qt.c
> >>> b/drivers/gpu/drm/tinydrm/mi0283qt.c
> >>> index 2465489..84ab8d1 100644
> >>> --- a/drivers/gpu/drm/tinydrm/mi0283qt.c
> >>> +++ b/drivers/gpu/drm/tinydrm/mi0283qt.c
> >>> @@ -31,6 +31,9 @@ static void mi0283qt_enable(struct
> >>> drm_simple_display_pipe *pipe,
> >>> 
> >>>         DRM_DEBUG_KMS("\n");
> >>>   
> >>> +       if (drm_dev_is_unplugged(&tdev->drm))
> >>> +               return;
> >>> +
> >>>         ret = regulator_enable(mipi->regulator);
> >>>         if (ret) {
> >>>                 dev_err(dev, "Failed to enable regulator (%d)\n", ret);
> >>> @@ -133,6 +136,7 @@ static struct drm_driver mi0283qt_driver = {
> >>>                                   DRIVER_ATOMIC,
> >>>         .fops                   = &mi0283qt_fops,
> >>>         TINYDRM_GEM_DRIVER_OPS,
> >>> +       .release                = tinydrm_release,

Should this be included in the TINYDRM_GEM_DRIVER_OPS macro?

> >>>         .lastclose              = tinydrm_lastclose,
> >>>         .debugfs_init           = mipi_dbi_debugfs_init,
> >>>         .name                   = "mi0283qt",

[snip]

-- 
Regards,

Laurent Pinchart

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel



