On 20.09.21 12:00, Tvrtko Ursulin wrote:
On 17/09/2021 13:35, Christian König wrote:
Simplifying the code a bit.
v2: add missing rcu read unlock.
Signed-off-by: Christian König <christian.koenig@xxxxxxx>
---
drivers/gpu/drm/i915/gem/i915_gem_wait.c | 57 ++++++------------------
1 file changed, 14 insertions(+), 43 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index f909aaa09d9c..e416cf528635 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -37,55 +37,26 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
unsigned int flags,
long timeout)
{
- struct dma_fence *excl;
- bool prune_fences = false;
-
- if (flags & I915_WAIT_ALL) {
- struct dma_fence **shared;
- unsigned int count, i;
- int ret;
-
- ret = dma_resv_get_fences(resv, &excl, &count, &shared);
- if (ret)
- return ret;
-
- for (i = 0; i < count; i++) {
- timeout = i915_gem_object_wait_fence(shared[i],
- flags, timeout);
- if (timeout < 0)
- break;
-
- dma_fence_put(shared[i]);
- }
-
- for (; i < count; i++)
- dma_fence_put(shared[i]);
- kfree(shared);
-
- /*
- * If both shared fences and an exclusive fence exist,
- * then by construction the shared fences must be later
- * than the exclusive fence. If we successfully wait for
- * all the shared fences, we know that the exclusive fence
- * must all be signaled. If all the shared fences are
- * signaled, we can prune the array and recover the
- * floating references on the fences/requests.
- */
- prune_fences = count && timeout >= 0;
- } else {
- excl = dma_resv_get_excl_unlocked(resv);
+ struct dma_resv_iter cursor;
+ struct dma_fence *fence;
+
+ rcu_read_lock();
+ dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL);
+ dma_resv_for_each_fence_unlocked(&cursor, fence) {
+ rcu_read_unlock();
+ timeout = i915_gem_object_wait_fence(fence, flags, timeout);
Converting this one could be problematic. It's the wait ioctl which
used to grab an atomic snapshot and wait for that rendering to
complete. With this change I think it has the potential to run forever
if it keeps catching new activity against the same object.
I am not sure whether or not the difference is relevant for how
userspace uses it, but I think it needs discussion.
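To spell out the difference I mean (rough sketch only, condensed from
the old and the new path of this function):

    /* Old path: grab an atomic snapshot of the fences once and wait
     * only on that snapshot; work submitted after the ioctl was
     * entered is not waited for.
     */
    ret = dma_resv_get_fences(resv, &excl, &count, &shared);
    for (i = 0; i < count; i++)
            timeout = i915_gem_object_wait_fence(shared[i], flags, timeout);

    /* New path: the unlocked iterator re-checks the reservation object
     * and restarts when it was modified concurrently, so fences added
     * after the wait started are picked up as well - which is where
     * the "potentially forever" comes from if the object keeps seeing
     * new activity.
     */
    dma_resv_for_each_fence_unlocked(&cursor, fence)
            timeout = i915_gem_object_wait_fence(fence, flags, timeout);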
It was years ago, but IIRC we had the same discussion for the
dma_resv_wait_timeout() function, and the conclusion was that this is
not a valid use case: waiting forever when you keep seeing new work
over and over again is a valid result.

Let me double check the history of this code here as well.
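For reference, the behaviour I mean is roughly this (sketch from
memory, please double check the kerneldoc for the exact return value
convention):

    /* Wait for all fences (or only the exclusive one), interruptible,
     * bounded by 'timeout' in jiffies. The wait can pick up fences
     * added while it is running, so with a huge timeout and a steady
     * stream of new work it can block for a very long time - that was
     * judged acceptable back then.
     */
    long ret = dma_resv_wait_timeout(resv, true /* wait_all */,
                                     true /* intr */, timeout);
    if (ret < 0)
            return ret;     /* e.g. -ERESTARTSYS */
    if (ret == 0)
            return -ETIME;  /* timed out */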
Hm actually there are internal callers as well, and at least some of
those have the object locked. Would a wider refactoring to separate
those into buckets (locked vs unlocked) make sense?
Yes definitely.
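Roughly what I have in mind (just a sketch, the helper name is made
up, and it assumes the locked dma_resv_for_each_fence() counterpart to
the unlocked iterator):

    /* Hypothetical helper for callers which already hold the dma_resv
     * lock: use the locked iterator, no RCU and no restart handling
     * needed.
     */
    static long wait_fences_locked(struct dma_resv *resv,
                                   unsigned int flags, long timeout)
    {
            struct dma_resv_iter cursor;
            struct dma_fence *fence;

            dma_resv_assert_held(resv);
            dma_resv_for_each_fence(&cursor, resv,
                                    flags & I915_WAIT_ALL, fence) {
                    timeout = i915_gem_object_wait_fence(fence, flags,
                                                         timeout);
                    if (timeout < 0)
                            break;
            }
            return timeout;
    }

    /* Lockless callers like the wait ioctl keep using
     * dma_resv_for_each_fence_unlocked() as in the patch above.
     */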
Regards,
Christian.
Regards,
Tvrtko
+ rcu_read_lock();
+ if (timeout < 0)
+ break;
}
-
- if (excl && timeout >= 0)
- timeout = i915_gem_object_wait_fence(excl, flags, timeout);
-
- dma_fence_put(excl);
+ dma_resv_iter_end(&cursor);
+ rcu_read_unlock();
/*
 * Opportunistically prune the fences iff we know they have *all* been
 * signaled.
 */
- if (prune_fences)
+ if (timeout > 0)
dma_resv_prune(resv);
return timeout;