On 1/18/21 1:55 PM, Thomas Hellström (Intel) wrote:
On 1/18/21 1:43 PM, Maarten Lankhorst wrote:
On 18-01-2021 at 12:30, Thomas Hellström (Intel) wrote:
Hi,
On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
Instead of doing what we do currently, which will never work with
PROVE_LOCKING, do the same as AMD does, and something similar to the
relocation slowpath. When all locks are dropped, we acquire the
pages for pinning. When the locks are taken, we transfer those
pages in .get_pages() to the bo. As a final check before installing
the fences, we ensure that the mmu notifier was not called; if it
was, we return -EAGAIN to userspace to signal it has to start over.
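For readers unfamiliar with the pattern, a minimal sketch of that flow
using the generic mmu_interval_notifier read_begin/read_retry helpers is
below; the struct and function names (my_userptr, userptr_submit_init,
userptr_check_before_fences) are made up for illustration and are not
the names used in this patch:

#include <linux/mm.h>
#include <linux/mmu_notifier.h>
#include <linux/slab.h>

struct my_userptr {
	/* registered elsewhere with mmu_interval_notifier_insert() */
	struct mmu_interval_notifier notifier;
	struct page **pvec;
	unsigned long npages;
	unsigned long notifier_seq;
};

/* With all ww locks dropped: pin the pages and sample the seqcount. */
static int userptr_submit_init(struct my_userptr *p, unsigned long addr)
{
	int pinned;

	p->notifier_seq = mmu_interval_read_begin(&p->notifier);

	p->pvec = kvmalloc_array(p->npages, sizeof(struct page *), GFP_KERNEL);
	if (!p->pvec)
		return -ENOMEM;

	pinned = pin_user_pages_fast(addr, p->npages, FOLL_WRITE, p->pvec);
	if (pinned < 0) {
		kvfree(p->pvec);
		return pinned;
	}
	if ((unsigned long)pinned < p->npages) {
		unpin_user_pages(p->pvec, pinned);
		kvfree(p->pvec);
		return -EFAULT;
	}
	return 0;
}

/*
 * With the locks re-taken, as the final check before installing fences.
 * This check has to be serialized against the invalidate callback, e.g.
 * under the same spinlock that guards mmu_interval_set_seq().
 */
static int userptr_check_before_fences(struct my_userptr *p)
{
	/* If the notifier ran since submit_init, the pages are stale. */
	if (mmu_interval_read_retry(&p->notifier, p->notifier_seq))
		return -EAGAIN;
	return 0;
}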
Changes since v1:
- Unbinding is done in submit_init only. submit_begin() removed.
- MMU_NOTFIER -> MMU_NOTIFIER
Changes since v2:
- Make i915->mm.notifier a spinlock.
Changes since v3:
- Add WARN_ON if there are any page references left, should have
been 0.
- Return 0 on success in submit_init(), bug from spinlock conversion.
- Release pvec outside of notifier_lock (Thomas).
Changes since v4:
- Mention why we're clearing eb->[i + 1].vma in the code. (Thomas)
- Actually check all invalidations in eb_move_to_gpu. (Thomas)
- Do not wait when process is exiting to fix
gem_ctx_persistence.userptr.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
...
-static int
-userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
- const struct mmu_notifier_range *range)
-{
- struct i915_mmu_notifier *mn =
- container_of(_mn, struct i915_mmu_notifier, mn);
- struct interval_tree_node *it;
- unsigned long end;
- int ret = 0;
-
- if (RB_EMPTY_ROOT(&mn->objects.rb_root))
- return 0;
-
- /* interval ranges are inclusive, but invalidate range is exclusive */
- end = range->end - 1;
-
- spin_lock(&mn->lock);
- it = interval_tree_iter_first(&mn->objects, range->start, end);
- while (it) {
- struct drm_i915_gem_object *obj;
-
- if (!mmu_notifier_range_blockable(range)) {
- ret = -EAGAIN;
- break;
- }
+ spin_lock(&i915->mm.notifier_lock);
- /*
- * The mmu_object is released late when destroying the
- * GEM object so it is entirely possible to gain a
- * reference on an object in the process of being freed
- * since our serialisation is via the spinlock and not
- * the struct_mutex - and consequently use it after it
- * is freed and then double free it. To prevent that
- * use-after-free we only acquire a reference on the
- * object if it is not in the process of being destroyed.
- */
- obj = container_of(it, struct i915_mmu_object, it)->obj;
- if (!kref_get_unless_zero(&obj->base.refcount)) {
- it = interval_tree_iter_next(it, range->start, end);
- continue;
- }
- spin_unlock(&mn->lock);
+ mmu_interval_set_seq(mni, cur_seq);
- ret = i915_gem_object_unbind(obj,
- I915_GEM_OBJECT_UNBIND_ACTIVE |
- I915_GEM_OBJECT_UNBIND_BARRIER);
- if (ret == 0)
- ret = __i915_gem_object_put_pages(obj);
- i915_gem_object_put(obj);
- if (ret)
- return ret;
+ spin_unlock(&i915->mm.notifier_lock);
- spin_lock(&mn->lock);
+ /* During exit there's no need to wait */
+ if (current->flags & PF_EXITING)
+ return true;
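(For orientation: the '+' fragments above are pieces of the new
per-object interval-notifier callback. A rough, non-authoritative
sketch of how such a callback could tie them together follows; apart
from the quoted '+' lines, the function name, the container_of lookup
and the final resv wait are assumptions, not text from the patch.)

static bool userptr_invalidate_sketch(struct mmu_interval_notifier *mni,
				      const struct mmu_notifier_range *range,
				      unsigned long cur_seq)
{
	/* Assumed layout: the interval notifier embedded in the userptr object. */
	struct drm_i915_gem_object *obj =
		container_of(mni, struct drm_i915_gem_object, userptr.notifier);
	struct drm_i915_private *i915 = to_i915(obj->base.dev);

	if (!mmu_notifier_range_blockable(range))
		return false;

	/* Bump the interval seqcount so a pending submit sees -EAGAIN. */
	spin_lock(&i915->mm.notifier_lock);
	mmu_interval_set_seq(mni, cur_seq);
	spin_unlock(&i915->mm.notifier_lock);

	/* During exit there's no need to wait */
	if (current->flags & PF_EXITING)
		return true;

	/* Otherwise wait for outstanding GPU activity before the pages go away. */
	dma_resv_wait_timeout_rcu(obj->base.resv, true, false,
				  MAX_SCHEDULE_TIMEOUT);
	return true;
}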
Did we ever find out why this is needed, that is, why the old userptr
invalidation callback doesn't hang here in a similar way?
It's an optimization for teardown because userptr will be invalidated
anyway, but also for gem_ctx_persistence.userptr, although
with ulls that test may stop working anyway because it takes an
out_fence.
Sure, but what I meant was: Did we find out what's different in the
new code compared to the old one? Because the old code also waits for
the GPU when unbinding in the mmu_notifier, but it appears that in the
old code the mmu notifier is never called here. At least to me it seems
it would be good if we understood what that difference is.
/Thomas
IIRC I did some investigation here as well and from what I could tell,
the notifier was not called at all for the old code. I don't really feel
comfortable with an R-B until we've really understood why.
Also, there was a discussion with you, Sudeep and Chris about whether
user-space actually had a problem with this; you said it worked, while
Chris said tests were showing otherwise. What were those tests?
Thanks,
Thomas