On Tue, Jan 28, 2020 at 11:00:32AM +0000, Chris Wilson wrote: > Quoting Daniel Vetter (2020-01-28 10:46:01) > > This catches the majority of drivers (unfortunately not if we take > > users into account, because all the big drivers have at least a > > lastclose hook). > > Yeah, that lastclose for switching back to fbcon, and ensuring that > switch is started before the next client takes over the console, does > nerf the ability of avoiding the global_mutex. > > > > > With the prep patches out of the way all drm state is fully protected > > and either prevents or can deal with the races from dropping the BKL > > around open/close. The only thing left to audit are the various driver > > hooks - by keeping the BKL around if any of them are set we have a > > very simple cop-out! > > > > Note that one of the biggest prep pieces to get here was making > > dev->open_count atomic, which was done in > > > > commit 7e13ad896484a0165a68197a2e64091ea28c9602 > > Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Date: Fri Jan 24 13:01:07 2020 +0000 > > > > drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count > > > > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx> > > --- > > drivers/gpu/drm/drm_drv.c | 6 +++-- > > drivers/gpu/drm/drm_file.c | 46 ++++++++++++++++++++++++++++++---- > > drivers/gpu/drm/drm_internal.h | 1 + > > 3 files changed, 46 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > > index 05bdf0b9d2b3..9fcd6ab3c154 100644 > > --- a/drivers/gpu/drm/drm_drv.c > > +++ b/drivers/gpu/drm/drm_drv.c > > @@ -946,7 +946,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) > > struct drm_driver *driver = dev->driver; > > int ret; > > > > - mutex_lock(&drm_global_mutex); > > + if (drm_dev_needs_global_mutex(dev)) > > + mutex_lock(&drm_global_mutex); > > > > if (dev->driver->load) { > > if (!drm_core_check_feature(dev, DRIVER_LEGACY)) > > @@ -992,7 +993,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) > > drm_minor_unregister(dev, DRM_MINOR_PRIMARY); > > drm_minor_unregister(dev, DRM_MINOR_RENDER); > > out_unlock: > > - mutex_unlock(&drm_global_mutex); > > + if (drm_dev_needs_global_mutex(dev)) > > + mutex_unlock(&drm_global_mutex); > > return ret; > > } > > EXPORT_SYMBOL(drm_dev_register); > > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c > > index d36cb74ebe0c..efd6fe0b6b4f 100644 > > --- a/drivers/gpu/drm/drm_file.c > > +++ b/drivers/gpu/drm/drm_file.c > > @@ -51,6 +51,37 @@ > > /* from BKL pushdown */ > > DEFINE_MUTEX(drm_global_mutex); > > > > +bool drm_dev_needs_global_mutex(struct drm_device *dev) > > +{ > > + /* > > + * Legacy drivers rely on all kinds of BKL locking semantics, don't > > + * bother. They also still need BKL locking for their ioctls, so better > > + * safe than sorry. > > + */ > > + if (drm_core_check_feature(dev, DRIVER_LEGACY)) > > + return true; > > + > > + /* > > + * The deprecated ->load callback must be called after the driver is > > + * already registered. This means such drivers rely on the BKL to make > > + * sure an open can't proceed until the driver is actually fully set up. > > + * Similar hilarity holds for the unload callback. > > + */ > > + if (dev->driver->load || dev->driver->unload) > > + return true; > > + > > + /* > > + * Drivers with the lastclose callback assume that it's synchronized > > + * against concurrent opens, which again needs the BKL. The proper fix > > + * is to use the drm_client infrastructure with proper locking for each > > + * client. > > + */ > > + if (dev->driver->lastclose) > > + return true; > > + > > + return false; > > Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > I'm not particularly fussed by this patch, maybe a bit too obviously > midlayer. Yeah it's not great. But I think long-term we might have a chance to limit this to DRIVER_LEGACY: - load/unload is on the way out - generic fbdev won't need lastclose - I have an idea for fixing the vgaswitcheroo locking, now that I starred at this wall a bit more. Should be a good undependent cleanup. - we could perhaps do a ->lastclose_unlocked for reducing that midlayer smell a tad. Same way we slowly managed to get rid of the dev->struct_mutex locking midlayer. > I was wondering if we should have a *dev->global_mutex which drivers can > set to be any level they need (with a bit of care on our part to make > sure it is not destroyed across dev->driver->lastclose). I feel like outside of legacy driver the drm_global_mutex protection is limited enough that there's no need to give driver writers ideas :-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx