On Fri, Oct 26, 2018 at 11:02 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
>
> On Thu, Oct 25, 2018 at 9:39 PM Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Quoting Daniel Vetter (2018-10-25 20:16:50)
> > > On Thu, Oct 25, 2018 at 01:45:42PM +0100, Chris Wilson wrote:
> > > > Since commit 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu
> > > > notifiers") we have been able to report failure from
> > > > mmu_invalidate_range_start which allows us to use a trylock on the
> > > > struct_mutex to avoid potential recursion and report -EBUSY instead.
> > > > Furthermore, this allows us to pull the work into the main callback and
> > > > avoid the sleight-of-hand in using a workqueue to avoid lockdep.
> > > >
> > > > However, not all paths to mmu_invalidate_range_start are prepared to
> > > > handle failure, so instead of reporting the recursion, deal with it.
> > > >
> > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108375
> > > > References: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
> > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.h         |   4 +-
> > > >  drivers/gpu/drm/i915/i915_gem.c         |  12 +-
> > > >  drivers/gpu/drm/i915/i915_gem_userptr.c | 219 ++++++++++++------------
> > > >  3 files changed, 115 insertions(+), 120 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 2d7761b8ac07..c3b94c3f930f 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -3076,8 +3076,8 @@ enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock */
> > > >          I915_MM_SHRINKER
> > > >  };
> > > >
> > > > -void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > > -                                 enum i915_mm_subclass subclass);
> > > > +int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > > +                                enum i915_mm_subclass subclass);
> > > >  void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj);
> > > >
> > > >  enum i915_map_type {
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 93d09282710d..de7ccb3eb7b8 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -2454,17 +2454,18 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
> > > >          return pages;
> > > >  }
> > > >
> > > > -void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > > -                                 enum i915_mm_subclass subclass)
> > > > +int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > > +                                enum i915_mm_subclass subclass)
> > > >  {
> > > >          struct sg_table *pages;
> > > > +        int ret = -EBUSY;
> > > >
> > > >          if (i915_gem_object_has_pinned_pages(obj))
> > > > -                return;
> > > > +                return -EBUSY;
> > > >
> > > >          GEM_BUG_ON(obj->bind_count);
> > > >          if (!i915_gem_object_has_pages(obj))
> > > > -                return;
> > > > +                return 0;
> > > >
> > > >          /* May be called by shrinker from within get_pages() (on another bo) */
> > > >          mutex_lock_nested(&obj->mm.lock, subclass);
> > > > @@ -2480,8 +2481,11 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > >          if (!IS_ERR(pages))
> > > >                  obj->ops->put_pages(obj, pages);
> > > >
> > > > +        ret = 0;
> > > >  unlock:
> > > >          mutex_unlock(&obj->mm.lock);
> > > > +
> > > > +        return ret;
> > > >  }
> > > >
> > > >  bool i915_sg_trim(struct sg_table *orig_st)
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > > index 2c9b284036d1..a8f3c304db55 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > > @@ -50,79 +50,84 @@ struct i915_mmu_notifier {
> > > >          struct hlist_node node;
> > > >          struct mmu_notifier mn;
> > > >          struct rb_root_cached objects;
> > > > -        struct workqueue_struct *wq;
> > > > +        struct i915_mm_struct *mm;
> > > >  };
> > > >
> > > >  struct i915_mmu_object {
> > > >          struct i915_mmu_notifier *mn;
> > > >          struct drm_i915_gem_object *obj;
> > > >          struct interval_tree_node it;
> > > > -        struct list_head link;
> > > > -        struct work_struct work;
> > > > -        bool attached;
> > > >  };
> > > >
> > > > -static void cancel_userptr(struct work_struct *work)
> > > > -{
> > > > -        struct i915_mmu_object *mo = container_of(work, typeof(*mo), work);
> > > > -        struct drm_i915_gem_object *obj = mo->obj;
> > > > -        struct work_struct *active;
> > > > -
> > > > -        /* Cancel any active worker and force us to re-evaluate gup */
> > > > -        mutex_lock(&obj->mm.lock);
> > > > -        active = fetch_and_zero(&obj->userptr.work);
> > > > -        mutex_unlock(&obj->mm.lock);
> > > > -        if (active)
> > > > -                goto out;
> > > > -
> > > > -        i915_gem_object_wait(obj, I915_WAIT_ALL, MAX_SCHEDULE_TIMEOUT, NULL);
> > > > -
> > > > -        mutex_lock(&obj->base.dev->struct_mutex);
> > > > -
> > > > -        /* We are inside a kthread context and can't be interrupted */
> > > > -        if (i915_gem_object_unbind(obj) == 0)
> > > > -                __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> > > > -        WARN_ONCE(i915_gem_object_has_pages(obj),
> > > > -                  "Failed to release pages: bind_count=%d, pages_pin_count=%d, pin_global=%d\n",
> > > > -                  obj->bind_count,
> > > > -                  atomic_read(&obj->mm.pages_pin_count),
> > > > -                  obj->pin_global);
> > > > -
> > > > -        mutex_unlock(&obj->base.dev->struct_mutex);
> > > > -
> > > > -out:
> > > > -        i915_gem_object_put(obj);
> > > > -}
> > > > -
> > > >  static void add_object(struct i915_mmu_object *mo)
> > > >  {
> > > > -        if (mo->attached)
> > > > +        if (!RB_EMPTY_NODE(&mo->it.rb))
> > > >                  return;
> > > >
> > > >          interval_tree_insert(&mo->it, &mo->mn->objects);
> > > > -        mo->attached = true;
> > > >  }
> > > >
> > > >  static void del_object(struct i915_mmu_object *mo)
> > > >  {
> > > > -        if (!mo->attached)
> > > > +        if (RB_EMPTY_NODE(&mo->it.rb))
> > > >                  return;
> > > >
> > > >          interval_tree_remove(&mo->it, &mo->mn->objects);
> > > > -        mo->attached = false;
> > > > +        RB_CLEAR_NODE(&mo->it.rb);
> > > > +}
> > > > +
> > > > +static void
> > > > +__i915_gem_userptr_set_active(struct drm_i915_gem_object *obj, bool value)
> > > > +{
> > > > +        struct i915_mmu_object *mo = obj->userptr.mmu_object;
> > > > +
> > > > +        /*
> > > > +         * During mm_invalidate_range we need to cancel any userptr that
> > > > +         * overlaps the range being invalidated. Doing so requires the
> > > > +         * struct_mutex, and that risks recursion. In order to cause
> > > > +         * recursion, the user must alias the userptr address space with
> > > > +         * a GTT mmapping (possible with a MAP_FIXED) - then when we have
> > > > +         * to invalidate that mmaping, mm_invalidate_range is called with
> > > > +         * the userptr address *and* the struct_mutex held. To prevent that
> > > > +         * we set a flag under the i915_mmu_notifier spinlock to indicate
> > > > +         * whether this object is valid.
> > > > +         */
> > > > +        if (!mo)
> > > > +                return;
> > > > +
> > > > +        spin_lock(&mo->mn->lock);
> > > > +        if (value)
> > > > +                add_object(mo);
> > > > +        else
> > > > +                del_object(mo);
> > > > +        spin_unlock(&mo->mn->lock);
> > > > +}
> > > > +
> > > > +static struct mutex *__i915_mutex_lock_recursive(struct mutex *m)
> > > > +{
> > > > +        switch (mutex_trylock_recursive(m)) {
> > > > +        default:
> > > > +        case MUTEX_TRYLOCK_FAILED:
> > > > +                mutex_lock(m);
> > > > +        case MUTEX_TRYLOCK_SUCCESS:
> > > > +                return m;
> > > > +
> > > > +        case MUTEX_TRYLOCK_RECURSIVE:
> > > > +                return ERR_PTR(-EEXIST);
> > > > +        }
> > > > }
> > > >
> > > >  static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > > > -                                                       struct mm_struct *mm,
> > > > -                                                       unsigned long start,
> > > > -                                                       unsigned long end,
> > > > -                                                       bool blockable)
> > > > +                                                      struct mm_struct *mm,
> > > > +                                                      unsigned long start,
> > > > +                                                      unsigned long end,
> > > > +                                                      bool blockable)
> > > >  {
> > > >          struct i915_mmu_notifier *mn =
> > > >                  container_of(_mn, struct i915_mmu_notifier, mn);
> > > > -        struct i915_mmu_object *mo;
> > > >          struct interval_tree_node *it;
> > > > -        LIST_HEAD(cancelled);
> > > > +        struct mutex *unlock = NULL;
> > > > +        int ret = 0;
> > > >
> > > >          if (RB_EMPTY_ROOT(&mn->objects.rb_root))
> > > >                  return 0;
> > > > @@ -133,11 +138,16 @@ static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > > >          spin_lock(&mn->lock);
> > > >          it = interval_tree_iter_first(&mn->objects, start, end);
> > > >          while (it) {
> > > > +                struct drm_i915_gem_object *obj;
> > > > +                bool has_pages = false;
> > > > +
> > > >                  if (!blockable) {
> > > > -                        spin_unlock(&mn->lock);
> > > > -                        return -EAGAIN;
> > > > +                        ret = -EAGAIN;
> > > > +                        break;
> > > >                  }
> > > > -                /* The mmu_object is released late when destroying the
> > > > +
> > > > +                /*
> > > > +                 * The mmu_object is released late when destroying the
> > > >                   * GEM object so it is entirely possible to gain a
> > > >                   * reference on an object in the process of being freed
> > > >                   * since our serialisation is via the spinlock and not
> > > > @@ -146,21 +156,44 @@ static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > > >                   * use-after-free we only acquire a reference on the
> > > >                   * object if it is not in the process of being destroyed.
> > > >                   */
> > > > -                mo = container_of(it, struct i915_mmu_object, it);
> > > > -                if (kref_get_unless_zero(&mo->obj->base.refcount))
> > > > -                        queue_work(mn->wq, &mo->work);
> > > > +                obj = container_of(it, struct i915_mmu_object, it)->obj;
> > > > +                if (!kref_get_unless_zero(&obj->base.refcount)) {
> > > > +                        it = interval_tree_iter_next(it, start, end);
> > > > +                        continue;
> > > > +                }
> > > > +                spin_unlock(&mn->lock);
> > > >
> > > > -                list_add(&mo->link, &cancelled);
> > > > -                it = interval_tree_iter_next(it, start, end);
> > > > +                /* Cancel any pending worker and force us to re-evaluate gup */
> > > > +                mutex_lock_nested(&obj->mm.lock, I915_MM_SHRINKER);

> > > mutex_lock_nested is meant for static nesting. This here is rather
> > > dynamic, and I don't really see where the guarantee is that the classic
> > > deadlock isn't possible:

> > It's meant to be after put_pages == 0 to ensure it only applied to a
> > different object. A bit more involved to do it exactly how I want, that
> > is to basically incorporate it into put_pages itself. Actually, if it is
> > in the put_pages it should just work...
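
As an aside, for readers following the lockdep argument: the sketch below
illustrates the "static nesting" that mutex_lock_nested() is designed for.
It is a generic example with a made-up struct foo; it is not code from this
patch or from i915.

#include <linux/lockdep.h>
#include <linux/mutex.h>

/* Made-up example structure, purely for illustration. */
struct foo {
        struct mutex lock;
};

/*
 * lockdep validates lock classes, not individual instances, so taking a
 * second foo->lock while already holding one looks like a self-deadlock
 * unless the inner acquisition carries a distinct subclass.  The subclass
 * annotation is only sound if some external invariant fixes which instance
 * is the outer one and which is the inner one ("static" nesting); it does
 * not by itself prevent an ABBA deadlock if two threads can nest the two
 * locks in opposite orders (e.g. a real implementation would also need a
 * global ordering rule, such as always locking the lower address first).
 */
static void foo_transfer(struct foo *dst, struct foo *src)
{
        mutex_lock(&dst->lock);
        mutex_lock_nested(&src->lock, SINGLE_DEPTH_NESTING);

        /* ... move state from src to dst ... */

        mutex_unlock(&src->lock);
        mutex_unlock(&dst->lock);
}

The subclass only tells lockdep that the nesting is intentional; whether the
required ordering invariant actually exists here is the question discussed
below.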

> So for the shrinker I guess you can make this work. For my taste
> there's not enough GEM_WARN_ON to check the invariants of the nesting
> scheme (I assume it is list_empty(obj->mm.link) -> use I915_MM_NORMAL,
> and I915_MM_SHRINKER for anyone who discovers an object through the
> obj->mm.link lists, but would be nice to double-check that every time
> we take the obj->mm.lock with a GEM_WARN_ON).

After a bit more reading around I think this isn't enough, and we also
rely on list_del(obj->mm.link) being serialized by dev->struct_mutex.
But I'm not entirely sure. We definitely seem to rely on the magic
"holding dev->struct_mutex extends the lifetime of anything with a
non-empty obj->mm.link" on the shrinker side still.
-Daniel

> But for the mmu notifier I'm not seeing that same guarantee, and
> didn't spot any other, so not clear why this works. And assuming I'm
> reading the code correctly, we do insert the object into the rb tree
> already when the gup worker is still doing its thing.
>
> Or maybe there's another invariant that guarantees the nesting, but I
> checked a few and they all don't seem to work (like pages_pin_count).
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
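
For completeness, the kind of GEM_WARN_ON asked for above might look roughly
like the sketch below. The helper name i915_gem_object_lock_mm is made up,
and the check encodes only the invariant assumed in the discussion (the
I915_MM_SHRINKER subclass is used by callers that found the object via the
obj->mm.link lists); whether that invariant, plus the struct_mutex
serialization of list_del(obj->mm.link), is actually sufficient is exactly
the open question.

/* Assumes i915 internals (drm_i915_gem_object, enum i915_mm_subclass,
 * GEM_WARN_ON) from this era of the driver; hypothetical helper only. */
static void i915_gem_object_lock_mm(struct drm_i915_gem_object *obj,
                                    enum i915_mm_subclass subclass)
{
        /* Shrinker-subclass callers should have discovered the object
         * through obj->mm.link, so an empty link would break the assumed
         * nesting scheme. */
        GEM_WARN_ON(subclass == I915_MM_SHRINKER &&
                    list_empty(&obj->mm.link));

        mutex_lock_nested(&obj->mm.lock, subclass);
}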