On Sat, Dec 10, 2016 at 10:36 PM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: > On Sat, Dec 10, 2016 at 09:28:02PM +0000, Chris Wilson wrote: >> On Sat, Dec 10, 2016 at 09:23:35PM +0000, Chris Wilson wrote: >> > On Sat, Dec 10, 2016 at 10:19:30PM +0100, Daniel Vetter wrote: >> > > On Fri, Dec 09, 2016 at 04:52:32PM +0000, Chris Wilson wrote: >> > > > With prime, we are running into false circular dependencies based on the >> > > > order in which two drivers may lock their own struct_mutex wrt to a >> > > > common lock (like the reservation->lock). Work around this by adding the >> > > > lock_class_key to the struct drm_driver such that each driver can have >> > > > its own subclass of struct_mutex. Circular dependencies between drivers >> > > > will now be ignored, but real circular dependencies on any one mutex >> > > > will still be caught. A driver creating more than one device will still >> > > > need to be careful! >> > > > >> > > > Reported-by: Tobias Klausmann <tobias.johannes.klausmann@xxxxxxxxxx> >> > > > Reported-by: Hans de Goede <hdegoede@xxxxxxxxxx> >> > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >> > > >> > > Where does this even happen? i915, msm and udl are the only drivers left >> > > over that do struct_mutex, and i915 can't really share buffers with msm, >> > > and udl doesn't do reservations. How exactly does this still go boom in >> > > latest upstream? >> > >> > How about cc: stable? >> > >> > The reports are nouveau vs i915. I was quite pleased with the >> > drm_driver_class! >> >> Ah, you may have removed any direct calls to struct_mutex from nouveau, >> but it is still using struct_mutex around its GEM bo references. >> >> git grep drm_gem_object_unreference_unlocked -- drivers/gpu/drm/nouveau/ | wc -l >> 13 > > Either s/drm_gem_object_unreference_unlocked/__drm_gem_object_unreference/ > > Or > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c > index 465bacd0a630..824a7780de06 100644 > --- a/drivers/gpu/drm/drm_gem.c > +++ b/drivers/gpu/drm/drm_gem.c > @@ -826,11 +826,13 @@ drm_gem_object_unreference_unlocked(struct drm_gem_object *obj) > return; > > dev = obj->dev; > - might_lock(&dev->struct_mutex); > - > - if (dev->driver->gem_free_object_unlocked) > + if (dev->driver->gem_free_object_unlocked) { > kref_put(&obj->refcount, drm_gem_object_free); > - else if (kref_put_mutex(&obj->refcount, drm_gem_object_free, > + return; > + } > + > + might_lock(&dev->struct_mutex); > + if (kref_put_mutex(&obj->refcount, drm_gem_object_free, > &dev->struct_mutex)) > mutex_unlock(&dev->struct_mutex); > } > > That's a false might_lock() that really should be pushed to kref_put_mutex() My thinking behind adding that might_lock was to make sure core code is still save for drivers which rely upon the struct mutex, by essentially enlisting _all_ drivers to validate this. Given that struct_mutex is indeed on its demise, or at least gem_free_object_unlocked is popular enough maybe time to change that? I.e. I like your diff here ;-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx