Re: [PATCH v5 7/7] drm/i915: Expose RPCS (SSEU) configuration to userspace

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 21/05/18 17:00, Tvrtko Ursulin wrote:

On 21/05/2018 14:22, Lionel Landwerlin wrote:
On 15/05/18 10:05, Tvrtko Ursulin wrote:

On 14/05/2018 16:56, Lionel Landwerlin wrote:
From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

We want to allow userspace to reconfigure the subslice configuration for
its own use case. To do so, we expose a context parameter to allow
adjustment of the RPCS register stored within the context image (and
currently not accessible via LRI). If the context is adjusted before
first use, the adjustment is for "free"; otherwise if the context is
active we flush the context off the GPU (stalling all users) and forcing
the GPU to save the context to memory where we can modify it and so
ensure that the register is reloaded on next execution.

The overhead of managing additional EU subslices can be significant,
especially in multi-context workloads. Non-GPGPU contexts should
preferably disable the subslices it is not using, and others should
fine-tune the number to match their workload.

We expose complete control over the RPCS register, allowing
configuration of slice/subslice, via masks packed into a u64 for
simplicity. For example,

    struct drm_i915_gem_context_param arg;
    struct drm_i915_gem_context_param_sseu sseu = { .class = 0,
.instance = 0, };

    memset(&arg, 0, sizeof(arg));
    arg.ctx_id = ctx;
    arg.param = I915_CONTEXT_PARAM_SSEU;
    arg.value = (uintptr_t) &sseu;
    if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &arg) == 0) {
        sseu.packed.subslice_mask = 0;

        drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);
    }

could be used to disable all subslices where supported.

v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu() (Lionel)

v3: Add ability to program this per engine (Chris)

v4: Move most get_sseu() into i915_gem_context.c (Lionel)

v5: Validate sseu configuration against the device's capabilities (Lionel)

v6: Change context powergating settings through MI_SDM on kernel context (Chris)

[snip]


+ intel_sseu_from_device_sseu(&INTEL_INFO(dev_priv)->sseu) :
+        sseu;
+
+    ret = engine->emit_rpcs_config(rq, ctx, actual_sseu);
+    if (ret) {
+        __i915_request_add(rq, true);
+        return ret;
+    }
+
+    /* Queue this switch after all other activity */
+    list_for_each_entry(timeline, &dev_priv->gt.timelines, link) {

This can iterate over gt.active_rings for a shorter walk. See current state of engine_has_idle_kernel_context.

Thanks.


+        struct i915_request *prev;
+
+        prev = last_request_on_engine(timeline, engine);
+        if (prev)
+ i915_sw_fence_await_sw_fence_gfp(&rq->submit,
+                             &prev->submit,
+                             I915_FENCE_GFP);
+    }
+
+    __i915_request_add(rq, true);

This is actually the bit I reading the patch for. So this I think is much better/safer than previous idling. However one concern, and maybe not a concern but just something which needs to be explained in the uAPI, is what it means with regards to the point from which the new configuration becomes live.

To me it's seems that all context ioctl effects are ordered.
So I would expect any execbuf after this ioctl would have the right configuration.
Anything before would have the previous configuration.

If that's your understanind too, I can add that in the documentation.

I guess I am missing what prevents the context which issued the SSEU re-configuraiton to run before the re-configuration request executes. They are on separate timelines and subsequent execbuf on the issuing context won't depend on it.


Could preemption for instance make it not defined enough? Could we, or should we, also link this request somehow onto the issuing context timeline so it must be first there? Hm, should we use the user context instead of the kernel one, and set the highest priority? But would have to avoid triggering preemption.

My understanding is that a context can't MI_LRI itself before Gen11.
I haven't tested that on Gen11 though.

I'm afraid that leaves us with only a max priority request tied to all the previous requests. Or somehow keep a pointer to the last request to change the context powergating configuration and add a dependency on all new request until it's retired?

I lost the train of dependencies here. :) [snip]

Sorry, I was getting confused by reading Chris' reply at the same time.

There is a problem with preemption and since the documentation tells us that we can't MI_LRI on the context itself (like Chris suggested and what would be really simpler), we have to go with an MI_SDM which this patch currently lacks synchronization for.



+
+        param_sseu.slice_mask = ce->sseu.slice_mask;
+        param_sseu.subslice_mask = ce->sseu.subslice_mask;
+        param_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice; +        param_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice;
+
+        if (copy_to_user(u64_to_user_ptr(args->value), &param_sseu,
+                 sizeof(param_sseu)))
+            ret = -EFAULT;
+        break;
+    }
      default:
          ret = -EINVAL;
          break;
@@ -845,7 +980,41 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
                  ctx->sched.priority = priority;
          }
          break;
+    case I915_CONTEXT_PARAM_SSEU:
+        {
+            struct drm_i915_private *dev_priv = to_i915(dev);
+            struct drm_i915_gem_context_param_sseu user_sseu;
+            struct intel_engine_cs *engine;
+            union intel_sseu ctx_sseu;
+
+            if (args->size) {
+                ret = -EINVAL;
+                break;
+            }
+
+            if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value),
+                       sizeof(user_sseu))) {
+                ret = -EFAULT;
+                break;
+            }
+
+            engine = intel_engine_lookup_user(dev_priv,
+                              user_sseu.class,
+                              user_sseu.instance);
+            if (!engine) {
+                ret = -EINVAL;
+                break;
+            }
+
+            ret = intel_sseu_from_user_sseu(&INTEL_INFO(dev_priv)->sseu,
+                            &user_sseu, &ctx_sseu);

Should setter have a helper as well?

I'm not sure what you're asking here :(
Setter for what?

set_param vs get_param. I only noticed you have helper to conver struct sseu from user to kernel version for one direction only. It could be it doesn't make sense to have the other, haven't looked that deeply.

Ah yeah, for the getter.

It's just that the setter does some error checking that the getter doesn't need to.
It makes the switch() { ... } a bit less verbose.


Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux