Re: devm actions and hw clenaup (was Re: [PATCH 01/11] drm: Add devm_drm_dev_init/register)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 24, 2019 at 6:46 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Jan 24, 2019 at 11:43:12AM +0100, Daniel Vetter wrote:
> > On Wed, Jan 23, 2019 at 11:54:07AM +0100, Noralf Trønnes wrote:
> > >
> > >
> > > Den 22.01.2019 20.30, skrev Daniel Vetter:
> > > > On Tue, Jan 22, 2019 at 8:07 PM Noralf Trønnes <noralf@xxxxxxxxxxx> wrote:
> > > >>
> > > >>
> > > >>
> > > >> Den 22.01.2019 10.32, skrev Daniel Vetter:
> > > >>> On Mon, Jan 21, 2019 at 01:21:46PM +0100, Noralf Trønnes wrote:
> > > >>>>
> > > >>>>
> > > >>>> Den 21.01.2019 10.55, skrev Daniel Vetter:
> > > >>>>> On Mon, Jan 21, 2019 at 10:10:14AM +0100, Daniel Vetter wrote:
> > > >>>>>> On Sun, Jan 20, 2019 at 12:43:08PM +0100, Noralf Trønnes wrote:
> > > >>>>>>> This adds resource managed (devres) versions of drm_dev_init() and
> > > >>>>>>> drm_dev_register().
> > > >>>>>>>
> > > >>>>>>> Also added is devm_drm_dev_register_with_fbdev() which sets up generic
> > > >>>>>>> fbdev emulation as well.
> > > >>>>>>>
> > > >>>>>>> devm_drm_dev_register() isn't exported since there are no users.
> > > >>>>>>>
> > > >>>>>>> Signed-off-by: Noralf Trønnes <noralf@xxxxxxxxxxx>
> > > >>>>>>
> > > >>>>
> > > >>>> <snip>
> > > >>>>
> > > >>>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > >>>>>>> index 381581b01d48..12129772be45 100644
> > > >>>>>>> --- a/drivers/gpu/drm/drm_drv.c
> > > >>>>>>> +++ b/drivers/gpu/drm/drm_drv.c
> > > >>>>>>> @@ -36,6 +36,7 @@
> > > >>>>>>>
> > > >>>>>>>  #include <drm/drm_client.h>
> > > >>>>>>>  #include <drm/drm_drv.h>
> > > >>>>>>> +#include <drm/drm_fb_helper.h>
> > > >>>>>>>  #include <drm/drmP.h>
> > > >>>>>>>
> > > >>>>>>>  #include "drm_crtc_internal.h"
> > > >>>>>>> @@ -871,6 +872,111 @@ void drm_dev_unregister(struct drm_device *dev)
> > > >>>>>>>  }
> > > >>>>>>>  EXPORT_SYMBOL(drm_dev_unregister);
> > > >>>>>>>
> > > >>>>>>> +static void devm_drm_dev_init_release(void *data)
> > > >>>>>>> +{
> > > >>>>>>> + drm_dev_put(data);
> > > >>>>>
> > > >>>>> We need drm_dev_unplug() here, or this isn't safe.
> > > >>>>
> > > >>>> This function is only used to cover the error path if probe fails before
> > > >>>> devm_drm_dev_register() is called. devm_drm_dev_register_release() is
> > > >>>> the one that calls unplug. There are comments about this in the functions.
> > > >>>
> > > >>> I think I get a prize for being ignorant and blind :-/
> > > >>>
> > > >>>>
> > > >>>>>
> > > >>>>>>> +}
> > > >>>>>>> +
> > > >>>>>>> +/**
> > > >>>>>>> + * devm_drm_dev_init - Resource managed drm_dev_init()
> > > >>>>>>> + * @parent: Parent device object
> > > >>>>>>> + * @dev: DRM device
> > > >>>>>>> + * @driver: DRM driver
> > > >>>>>>> + *
> > > >>>>>>> + * Managed drm_dev_init(). The DRM device initialized with this function is
> > > >>>>>>> + * automatically released on driver detach. You must supply a
> > > >>>>>
> > > >>>>> I think a bit more clarity here would be good:
> > > >>>>>
> > > >>>>> "... automatically released on driver unbind by callind drm_dev_unplug()."
> > > >>>>>
> > > >>>>>>> + * &drm_driver.release callback to control the finalization explicitly.
> > > >>>>>
> > > >>>>> I think a loud warning for these is in order:
> > > >>>>>
> > > >>>>> "WARNING:
> > > >>>>>
> > > >>>>> "In generally it is unsafe to use devm functions for drm structures
> > > >>>>> because the lifetimes of &drm_device and the underlying &device do not
> > > >>>>> match. This here works because it doesn't immediately free anything, but
> > > >>>>> only calls drm_dev_unplug(), which internally decrements the &drm_device
> > > >>>>> refcount through drm_dev_put().
> > > >>>>>
> > > >>>>> "All other drm structures must still be explicitly released in the
> > > >>>>> &drm_driver.release callback."
> > > >>>>>
> > > >>>>> While thinking about this I just realized that with this design we have no
> > > >>>>> good place to call drm_atomic_helper_shutdown(). Which we need to, or all
> > > >>>>> kinds of things will leak badly (connectors, fb, ...), but there's no
> > > >>>>> place to call it:
> > > >>>>> - unbind is too early, since we haven't yet called drm_dev_unplug, and the
> > > >>>>>   drm_dev_unregister in there must be called _before_ we start to shut
> > > >>>>>   down anything.
> > > >>>>> - drm_driver.release is way too late.
> > > >>>>>
> > > >>>>> Ofc for a real hotunplug there's no point in shutting down the hw (it's
> > > >>>>> already gone), but for a driver unload/unbind it would be nice if this
> > > >>>>> happens automatically and in the right order.
> > > >>>>>
> > > >>>>> So not sure what to do here really.
> > > >>>>
> > > >>>> How about this change: (it breaks the rule of pulling helpers into the
> > > >>>> core, so maybe we should put the devm_ functions into the simple KMS
> > > >>>> helper instead?)
> > > >>>
> > > >>> Yeah smells a bit much like midlayer ... What would work is having a pile
> > > >>> more devm_ helper functions, so that we onion-unwrap everything correctly,
> > > >>> and in the right order. So:
> > > >>>
> > > >>> - devm_drm_dev_init (always does a drm_dev_put())
> > > >>>
> > > >>> - devm_drm_poll_enable (shuts down the poll helper with a devm action)
> > > >>>
> > > >>> - devm_drm_mode_config_reset (does an atomic_helper_shutdown() as it's cleanup action)
> > > >>>
> > > >>> - devm_drm_dev_register (grabs an additional drm_dev_get() reference so it
> > > >>>   can call drm_dev_unplug() unconditionally).
> > > >>>
> > > >>
> > > >> Beautiful! I really like this, it's very flexible.
> > > >>
> > > >> Where should devm_drm_mode_config_reset() live? It will pull in the
> > > >> atomic helper...
> > > >
> > > > I think a new drm_devm.c helper would be nice for all this stuff.
> > > > Especially since you can't freely mix devm-based setup/cleanup with
> > > > normal cleanup I think it'd be good to have it all together in one
> > > > place. And perhaps even a code example in the DOC: overview.
> > > >
> > > >>> We'd need to make sure some of the cleanup actions dtrt when the device is
> > > >>> gone, but I think we can achieve that by liberally sprinkling
> > > >>> drm_dev_enter/exit over them, e.g. the the cleanup action for
> > > >>> drm_mode_config_reset would be:
> > > >>>
> > > >>> {
> > > >>>       if (drm_dev_enter())
> > > >>>               return;
> > > >>>
> > > >>>       drm_atomic_helper_shutdown();
> > > >>>
> > > >>>       drm_dev_exit();
> > > >>> }
> > > >>>
> > > >>
> > > >> drm_dev_enter() can only be used to check whether the drm_device is
> > > >> registered or not, it doesn't say anything about the state of the parent
> > > >> device.
> > > >>
> > > >> All we know is that the device is being unbound from the driver, we
> > > >> don't know if it's the device that's being removed or if it's the driver
> > > >> that's unregistered.
> > > >
> > > > You're right, both paths will have called drm_dev_unplug by then.
> > > > Silly me. I really liked my idea :-)
> > > >
> > > >> I have looked at the various call chains:
> > > >>
> > > >> driver_unregister ->
> > > >>     bus_remove_driver ->
> > > >>         driver_detach ->
> > > >>             device_release_driver_internal
> > > >>
> > > >> device_unregister ->
> > > >>     device_del ->
> > > >>         bus_remove_device ->
> > > >>             device_release_driver ->
> > > >>                 device_release_driver_internal
> > > >>
> > > >> sysfs: unbind_store ->
> > > >>     device_release_driver ->
> > > >>         device_release_driver_internal
> > > >>
> > > >> The only way I've found to differentiate between these in a cleanup
> > > >> action is that device_del() uses the bus notifier to signal
> > > >> BUS_NOTIFY_DEL_DEVICE before calling bus_remove_device(). Such a
> > > >> notifier could be used to set a drm_device->parent_removed flag.
> > > >
> > > > Hm, this might upset Greg KH's code taste ... maybe there's a better
> > > > way to do this, but best to prototype a patch with this, send it to
> > > > him and ask how to :-)
> > > >
> > >
> > > I'll leave this to the one that needs it. The tinydrm drivers doesn't
> > > need to touch hw after DRM unregister.
> > >
> > > >> Why is it necessary to call drm_atomic_helper_shutdown() here? Doesn't
> > > >> everything get disabled when userspace closes? It does in my tinydrm
> > > >> world :-)
> > > >
> > > > Iirc fbdev/fbcon can result in leaks ... at least we've had patches
> > > > where drivers leaked drm_connector and drm_framebuffer objects, and
> > > > they've been fixed by calling drm_atomic_helper_shutdown() in the
> > > > unload path. Maybe this is cargo-culting, but it goes way back to
> > > > pre-atomic, where drivers called drm_helper_force_disable_all().
> > > >
> > > > If you try to move the fbcon to your tinydrm drivers (con2fb is
> > > > apparently the cmdline tool you need, never tried it, I only switch
> > > > the kernel's console between fbcon and dummycon and back, not what
> > > > fbcon drivers itself), then I think you should be able to reproduce.
> > > > And maybe you have a better idea how to deal with this all.
> > > >
> > > > Note also that there's been proposals floating around to only close an
> > > > drm_framebuffer, not also remove it (like the current RMFB ioctl
> > > > does), with that closing userspace would not necessarily lead to a
> > > > full cleanup.
> > > >
> > > > Another thing (which doesn't apply to drm_simple_display_pipe drivers)
> > > > is if you have the display on, but no planes showing (i.e. all black).
> > > > Then all the fbs will be cleaned up, but drm_connector will be
> > > > leaking. That's a case where you need drm_atomic_helper_shutdown()
> > > > even if fbcon/fbdev isn't even enabled.
> > > >
> > >
> > > Ok, this means that I don't need to call drm_atomic_helper_shutdown() in
> > > tinydrm. DRM userspace disables the pipe on close and the generic fbdev
> > > emulation also releases everything.
> > > Even so, maybe I should use devm_drm_mode_config_reset() after all to
> > > keep drivers uniform, to avoid confusion: why doesn't he use it?
> >
> > Hm maybe there is an official way to solve this, pulling in Greg+Rafael.
> >
> > Super short summary: We want to start using devm actions to clean up drm
> > drivers. Here's the problem:
> > - For a driver unload/unbind without hotunplug, we want to properly clean
> >   up the hardware and shut it all down.
>
> Then do it on probe/disconnect.
>
> > - But if the device is unplugged already, that's probably not the best
> >   idea, and we only want to clean up the kernel's resources/allocations.
>
> Again, probe/disconnect will be called either way.
>
> But as you note, memory will NOT be freed by the devm stuff if you
> manually unbind a driver from a device.
>
> So don't touch hardware there, it's not going to work :)
>
> > What's the recommendation here? I see a few options:
> >
> > - Make sure everything can deal with this properly. Hotunplug can happen
> >   anytime, so there's a race no matter what.
>
> Yes.
>
> > - Check with the device model whether the struct device is disappearing or
> >   whether we're just dealing with a driver unbind (no idea how to do
> >   that), and act accordingly.
>
> You don't know that, sorry.  Just do any hardware stuff on disconnect.
> Assuming your hardware is still present :)
>
> > - Fundamental question: Touching the hw from devm actions, is that ok? If
> >   not, then the pretty nifty plan laid out in this thread wont work.
>
> Nope, that's not going to work, the device could either be long gone, or
> you will not be called due to unbind happening from userspace.
>
> But really, unbind from userspace is very very rare, it's a developer
> thing mostly.  Oh and a virtual driver thing, but those people are crazy
> :)
>
> > - Something completely different?
>
> Do it in disconnect :)

Ah, I forgot to mention the important constraint :-) disconnect/unbind
should be the following sequence:

1. Unregister all the userspace interfaces (there's a lot of them) and
make sure all the pending ioctls are done so that from now on
userspace sees lots of -EIO (in case it still has fd open, which is
going to be the normal for hotunplug.

2. Shut down hw and all ongoing operations (only relevant for unbind,
but needs to be able to cope with sudden hotunplug on top anyway).

3. Clean up the kernel mess and release everything.

Probe is exactly the other way round, so would perfectly fit into the
devm onion cleanup. See in the commented earlier replies above how
that would match in details, but tldr; if we have to do 2. in
disconnect, then we also have to do 1. in disconnected, and only doing
3. through devm is almost not worth the bother. But if we could do all
three through devm then simple drivers wouldn't even need any
disconnect/unbind callback at all. That's our motivation for trying to
come up with an answer that's not "do it in disconnect". "do it in
disconnect" is how we do it all today already.

Yes we're trying to make tiny drivers even smaller, we have enough
nowadays that this stuff would be worth it :-)

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux