Re: [PATCH 31/32] drm/i915/execlists: Virtual engine bonding

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Thu, 18 Apr 2019 10:50:30 +0100

On 18/04/2019 10:13, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-04-18 09:57:43)

On 18/04/2019 07:57, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-04-18 07:47:51)

On 17/04/2019 08:56, Chris Wilson wrote:
+static void
+virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
+{
+     struct virtual_engine *ve = to_virtual_engine(rq->engine);
+     struct ve_bond *bond;
+
+     bond = virtual_find_bond(ve, to_request(signal)->engine);
+     if (bond) {
+             intel_engine_mask_t old, new, cmp;
+
+             cmp = READ_ONCE(rq->execution_mask);
+             do {
+                     old = cmp;
+                     new = cmp & bond->sibling_mask;
+             } while ((cmp = cmpxchg(&rq->execution_mask, old, new)) != old);

Loop implies someone else might be modifying the rq->execution_mask in
parallel?

There's nothing that prevents there being multiple bonds being
executed simultaneously (other than practicality). There's also nothing
that says this should be the only way to modify rq->execution_mask in
the future.

But request is one, how can it be submitted multiple times simultaneously?

You mean "How can it be signaled multiple times simultaneously?"

Okay yes, signaled. You could give same submit fence to multiple slaves, 
but you can't have same slave request receive notification from multiple 
masters.

Or you can if you build a composite fence and pass that in? Is this the 
story about signal-on-any vs signal-on-all?

+static int
+set_engines__bond(struct i915_user_extension __user *base, void *data)
+{
+     struct i915_context_engines_bond __user *ext =
+             container_of_user(base, typeof(*ext), base);
+     const struct set_engines *set = data;
+     struct intel_engine_cs *virtual;
+     struct intel_engine_cs *master;
+     u16 class, instance;
+     u16 idx, num_bonds;
+     int err, n;
+
+     if (get_user(idx, &ext->virtual_index))
+             return -EFAULT;
+
+     if (idx >= set->engines->num_engines) {
+             DRM_DEBUG("Invalid index for virtual engine: %d >= %d\n",
+                       idx, set->engines->num_engines);
+             return -EINVAL;
+     }
+
+     idx = array_index_nospec(idx, set->engines->num_engines);
+     if (!set->engines->engines[idx]) {
+             DRM_DEBUG("Invalid engine at %d\n", idx);
+             return -EINVAL;
+     }
+
+     /*
+      * A non-virtual engine has 0 siblings to choose between; and submit
+      * fence will always be directed to the one engine.
+      */
+     virtual = set->engines->engines[idx]->engine;
+     if (!intel_engine_is_virtual(virtual))
+             return 0;

Hmm wouldn't we strictly speaking need to distinguish between uAPI
errors and auto-magic-single-veng-replacement? Latter is OK to return
success, but former should be reported as -EINVAL I think.

Is it a uAPI error if it works? :)

It works but what is the practical use? It more signals userspace got
it's configuration wrong and if we silently accept it gets more
difficult to figure out.

At that point, I was being facetious. Memory says it was simpler to just
stick the virtual check at the start than have to insert it later. But
it's trivial to move later, so it's done.

+
+     err = check_user_mbz(&ext->flags);
+     if (err)
+             return err;
+
+     for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
+             err = check_user_mbz(&ext->mbz64[n]);
+             if (err)
+                     return err;
+     }
+
+     if (get_user(class, &ext->master_class))
+             return -EFAULT;
+
+     if (get_user(instance, &ext->master_instance))
+             return -EFAULT;
+
+     master = intel_engine_lookup_user(set->ctx->i915, class, instance);
+     if (!master) {
+             DRM_DEBUG("Unrecognised master engine: { class:%d, instance:%d }\n",
+                       class, instance);
+             return -EINVAL;
+     }
+
+     if (get_user(num_bonds, &ext->num_bonds))
+             return -EFAULT;

Should num_bonds > virtual->num_siblings be an error?

They could specify the same bond multiple times for whatever reason (and
probably should allow skipping NONE?), if the target doesn't exist that's
definitely an error.

So which bond we pick if they specify multiple ones? Just the first one
found. Hm actually I was thinking about making sure each master is only
specified once, not siblings. For siblings we indeed do not care.

No, it's a mask of if parent executes on master, use this set of
children.

I was reasonably happy to use a cumulative mask if master is specified
by more than one bond ext; but maybe it should be an intersection. Hmm.

Do you see a realistic and making sense use case for specifying the same 
master in multiple bonds? If not I'd just disallow it and then we don't 
have a question of union vs intersection policy.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx