Re: [PATCH 17/50] drm/i915: Switch back to an array of logical per-engine HW contexts

On 12/04/2019 14:43, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-04-12 14:31:31)

On 12/04/2019 09:53, Chris Wilson wrote:
@@ -875,22 +927,27 @@ static int context_barrier_task(struct i915_gem_context *ctx,
       i915_active_init(i915, &cb->base, cb_retire);
       i915_active_acquire(&cb->base);

-     rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
-             struct intel_engine_cs *engine = ce->engine;
+     for_each_gem_engine(it, ctx) {
+             struct intel_context *ce = it.context;
               struct i915_request *rq;

-             if (!(engine->mask & engines))
+             if (!(ce->engine->mask & engines))
+                     continue;
+
+             if (!intel_context_is_pinned(ce))
                       continue;

               if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
-                                    engine->mask)) {
+                                    ce->engine->mask)) {
                       err = -ENXIO;
+                     i915_gem_engines_iter_fini(&it);

This would be our only iterator which needs manual cleanup. But only on
abort, not normal termination.

Would it be cleaner if for_each_gem_engine took the engines list, which the
caller would first have to obtain via i915_gem_context_engine_list_lock?
The caller would then obviously also be responsible for unlocking.

But then it's just a for (i = 0; i < engines->num_engines; i++) again.
I'm not really convinced that if the caller has to do everything anyway,
replacing the for() is much of a gain :)
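
For illustration, the caller-managed variant would look roughly like this -
only a sketch, where the struct i915_gem_engines type name and the
i915_gem_context_engine_list_unlock() counterpart to the lock helper are
assumptions on my part:

	struct i915_gem_engines *e; /* engines list type name assumed */
	unsigned int i;

	e = i915_gem_context_engine_list_lock(ctx);
	for (i = 0; i < e->num_engines; i++) {
		struct intel_context *ce = e->engines[i];

		if (!ce) /* every caller has to skip unset slots by hand */
			continue;

		/* ... use ce; a plain break is enough to bail out early ... */
	}
	i915_gem_context_engine_list_unlock(ctx); /* unlock helper name assumed */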

                       break;
               }

-             rq = i915_request_alloc(engine, ctx);
+             rq = intel_context_create_request(ce);
               if (IS_ERR(rq)) {
                       err = PTR_ERR(rq);
+                     i915_gem_engines_iter_fini(&it);
                       break;
               }
@@ -901,8 +958,10 @@ static int context_barrier_task(struct i915_gem_context *ctx,
                       err = i915_active_ref(&cb->base, rq->fence.context, rq);

               i915_request_add(rq);
-             if (err)
+             if (err) {
+                     i915_gem_engines_iter_fini(&it);
                       break;
+             }
       }

       cb->task = err ? NULL : task; /* caller needs to unwind instead */
@@ -1739,6 +1798,43 @@ int __i915_gem_context_pin_hw_id(struct i915_gem_context *ctx)
       return err;
   }
+/* GEM context-engines iterator: for_each_gem_engine() */
+bool i915_gem_engines_iter_next(struct i915_gem_engines_iter *it)
+{
+     if (!it->engines)
+             goto end;

When can engines be NULL? Should it be a GEM_BUG_ON, if that can only
happen when used outside of for_each_gem_engine?

Hmm, it can't. I seem to have been unusually defensive!

+     do {
+             if (it->idx >= it->engines->num_engines)
+                     goto end;
+
+             it->context = it->engines->engines[it->idx++];
+     } while (!it->context);
+
+     return true;
+
+end:
+     i915_gem_engines_iter_fini(it);
+     return false;
+}
+
+void i915_gem_engines_iter_init(struct i915_gem_engines_iter *it,
+                             struct i915_gem_context *ctx)
+{
+     it->gem_context = ctx;
+     it->engines = i915_gem_context_engine_list_lock(ctx);
+     it->idx = 0;

Clear it->context just in case?

+}
+
+void i915_gem_engines_iter_fini(struct i915_gem_engines_iter *it)
+{
+     struct i915_gem_context *ctx;
+
+     ctx = fetch_and_zero(&it->gem_context);

This is so iter_fini can be called multiple times? When does that come into
play? Oh... when i915_gem_engines_iter_next went past the array and the
loop decides to bail out.

Yeah, I was just worrying about the mix of the user calling
	iter_fini();
	continue;

If the iterator took engines as a parameter, this could also go.

Equally if there was no iterator :)

So is the iterator worth it? If I have to do the locking and list management
in the caller, I would say no. Yay or nay?

I think yes. My arguments are:

1. Removes the need for every loop to add a skip on !ce.
2. Avoids mixing up two different concepts - device engines and context engines.
3. With the virtual engine added, for_each_ctx_engine also walks over that one. I don't think we have a use for this yet though.

I am just not sure whether I like the one in your patch or a simpler one which leaves obtaining/releasing the engine list to the caller. I am leaning towards the latter: it is extra work for the caller, but so is managing the iter_fini requirement, so the two end up almost equal in that respect, and the simpler version is more explicit and therefore harder to mess up.
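
To make the comparison concrete, the context_barrier_task() loop with such a
simpler variant could look roughly like this - just a sketch, with the
for_each_gem_engine() argument list and the unlock helper name being
assumptions rather than anything from this series:

	struct i915_gem_engines_iter it;
	struct intel_context *ce;
	struct i915_gem_engines *e; /* engines list type name assumed */
	int err = 0;

	e = i915_gem_context_engine_list_lock(ctx);
	for_each_gem_engine(ce, e, it) { /* walks the locked list, skipping unset slots */
		struct i915_request *rq;

		if (!(ce->engine->mask & engines))
			continue;

		rq = intel_context_create_request(ce);
		if (IS_ERR(rq)) {
			err = PTR_ERR(rq);
			break; /* no i915_gem_engines_iter_fini() needed on early exit */
		}

		/* ... emit the barrier and i915_request_add(rq) as before ... */
	}
	i915_gem_context_engine_list_unlock(ctx); /* unlock helper name assumed */

The explicit lock/unlock bracket is more typing, but the early exits inside
the loop stay trivial.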

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



