Re: [PATCH 13/26] drm/i915: use the new iterator in i915_gem_busy_ioctl

Christian König <ckoenig.leichtzumerken@xxxxxxxxx> · Wed, 22 Sep 2021 14:15:33 +0200

Am 22.09.21 um 13:46 schrieb Tvrtko Ursulin:

On 22/09/2021 11:21, Tvrtko Ursulin wrote:

On 22/09/2021 10:10, Christian König wrote:
This makes the function much simpler since the complex
retry logic is now handled else where.

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
---
  drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 
++++++++++--------------
  1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
index 6234e17259c1..313afb4a11c7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void 
*data,
  {
      struct drm_i915_gem_busy *args = data;
      struct drm_i915_gem_object *obj;
-    struct dma_resv_list *list;
-    unsigned int seq;
+    struct dma_resv_iter cursor;
+    struct dma_fence *fence;
      int err;
      err = -ENOENT;
@@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, 
void *data,
       * to report the overall busyness. This is what the wait-ioctl 
does.
       *
       */
-retry:
-    seq = raw_read_seqcount(&obj->base.resv->seq);
-
-    /* Translate the exclusive fence to the READ *and* WRITE engine */
-    args->busy = 
busy_check_writer(dma_resv_excl_fence(obj->base.resv));
-
-    /* Translate shared fences to READ set of engines */
-    list = dma_resv_shared_list(obj->base.resv);
-    if (list) {
-        unsigned int shared_count = list->shared_count, i;
-
-        for (i = 0; i < shared_count; ++i) {
-            struct dma_fence *fence =
-                rcu_dereference(list->shared[i]);
-
+    args->busy = false;

You can drop this line, especially since it is not a boolean. With that:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Having said this, one thing to add in the commit message is some 
commentary that although simpler in code, the new implementation has a 
lot more atomic instructions due all the extra fence get/put.

Saying this because I remembered busy ioctl is quite an over-popular 
one. Thinking about traces from some real userspaces I looked at in 
the past.

So I think ack from maintainers will be required here. Because I just 
don't know if any performance impact will be visible or not. So view 
my r-b as "code looks fine" but I am on the fence if it should 
actually be merged. Probably leaning towards no actually - given how 
the code is localised here and I dislike burdening old platforms with 
more CPU time it could be cheaply left as is.

Well previously we would have allocated memory, which as far as I know 
has more overhead than a few extra atomic operations.

On the other hand I could as well stick with dma_resv_get_fences() here.

Regards,
Christian.


Regards,

Tvrtko



Regards,

Tvrtko

+    dma_resv_iter_begin(&cursor, obj->base.resv, true);
+    dma_resv_for_each_fence_unlocked(&cursor, fence) {
+        if (dma_resv_iter_is_restarted(&cursor))
+            args->busy = 0;
+
+        if (dma_resv_iter_is_exclusive(&cursor))
+            /* Translate the exclusive fence to the READ *and* 
WRITE engine */
+            args->busy |= busy_check_writer(fence);
+        else
+            /* Translate shared fences to READ set of engines */
              args->busy |= busy_check_reader(fence);
-        }
      }
-
-    if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq))
-        goto retry;
+    dma_resv_iter_end(&cursor);
      err = 0;
  out: