Re: [PATCH 5/7] drm/i915: Expose RPCS (SSEU) configuration to userspace (Gen11 only)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 08/01/2019 14:22, Joonas Lahtinen wrote:
Quoting Tvrtko Ursulin (2019-01-08 13:22:50)
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

We want to allow userspace to reconfigure the subslice configuration on a
per context basis.

This is required for the functional requirement of shutting down non-VME
enabled sub-slices on Gen11 parts.

To do so, we expose a context parameter to allow adjustment of the RPCS
register stored within the context image (and currently not accessible via
LRI).

If the context is adjusted before first use or whilst idle, the adjustment
is for "free"; otherwise if the context is active we queue a request to do
so (using the kernel context), following all other activity by that
context, which is also marked as barrier for all following submission
against the same context.

Since the overhead of device re-configuration during context switching can
be significant, especially in multi-context workloads, we limit this new
uAPI to only support the Gen11 VME use case. In this use case either the
device is fully enabled, and exactly one slice and half of the subslices
are enabled.

Example usage:

         struct drm_i915_gem_context_param_sseu sseu = { };
         struct drm_i915_gem_context_param arg =
                 { .param = I915_CONTEXT_PARAM_SSEU,
                   .ctx_id = gem_context_create(fd),
                   .size = sizeof(sseu),
                   .value = to_user_pointer(&sseu)
                 };

         /* Query device defaults. */
         gem_context_get_param(fd, &arg);

         /* Set VME configuration on a 1x6x8 part. */
         sseu.slice_mask = 0x1;
         sseu.subslice_mask = 0xe0;
         gem_context_set_param(fd, &arg);

v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu() (Lionel)

v3: Add ability to program this per engine (Chris)

v4: Move most get_sseu() into i915_gem_context.c (Lionel)

v5: Validate sseu configuration against the device's capabilities (Lionel)

v6: Change context powergating settings through MI_SDM on kernel context (Chris)

v7: Synchronize the requests following a powergating setting change using a global
     dependency (Chris)
     Iterate timelines through dev_priv.gt.active_rings (Tvrtko)
     Disable RPCS configuration setting for non capable users (Lionel/Tvrtko)

v8: s/union intel_sseu/struct intel_sseu/ (Lionel)
     s/dev_priv/i915/ (Tvrtko)
     Change uapi class/instance fields to u16 (Tvrtko)
     Bump mask fields to 64bits (Lionel)
     Don't return EPERM when dynamic sseu is disabled (Tvrtko)

v9: Import context image into kernel context's ppgtt only when
     reconfiguring powergated slice/subslices (Chris)
     Use aliasing ppgtt when needed (Michel)

Tvrtko Ursulin:

v10:
  * Update for upstream changes.
  * Request submit needs a RPM reference.
  * Reject on !FULL_PPGTT for simplicity.
  * Pull out get/set param to helpers for readability and less indent.
  * Use i915_request_await_dma_fence in add_global_barrier to skip waits
    on the same timeline and avoid GEM_BUG_ON.
  * No need to explicitly assign a NULL pointer to engine in legacy mode.
  * No need to move gen8_make_rpcs up.
  * Factored out global barrier as prep patch.
  * Allow to only CAP_SYS_ADMIN if !Gen11.

v11:
  * Remove engine vfunc in favour of local helper. (Chris Wilson)
  * Stop retiring requests before updates since it is not needed
    (Chris Wilson)
  * Implement direct CPU update path for idle contexts. (Chris Wilson)
  * Left side dependency needs only be on the same context timeline.
    (Chris Wilson)
  * It is sufficient to order the timeline. (Chris Wilson)
  * Reject !RCS configuration attempts with -ENODEV for now.

v12:
  * Rebase for make_rpcs.

v13:
  * Centralize SSEU normalization to make_rpcs.
  * Type width checking (uAPI <-> implementation).
  * Gen11 restrictions uAPI checks.
  * Gen11 subslice count differences handling.
  Chris Wilson:
  * args->size handling fixes.
  * Update context image from GGTT.
  * Postpone context image update to pinning.
  * Use i915_gem_active_raw instead of last_request_on_engine.

v14:
  * Add activity tracker on intel_context to fix the lifetime issues
    and simplify the code. (Chris Wilson)

v15:
  * Fix context pin leak if no space in ring by simplifying the
    context pinning sequence.

v16:
  * Rebase for context get/set param locking changes.
  * Just -ENODEV on !Gen11. (Joonas)

v17:
  * Fix one Gen11 subslice enablement rule.
  * Handle error from i915_sw_fence_await_sw_fence_gfp. (Chris Wilson)

v18:
  * Update commit message. (Joonas)
  * Restrict uAPI to VME use case. (Joonas)

v19:
  * Rebase.

v20:
  * Rebase for ce->active_tracker.

v21:
  * Rebase for IS_GEN changes.

v22:
  * Reserve uAPI for flags straight away. (Chris Wilson)

v23:
  * Rebase for RUNTIME_INFO.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100899
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107634
Issue: https://github.com/intel/media-driver/issues/267
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@xxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Zhipeng Gong <zhipeng.gong@xxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> # v21

<SNIP>

+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2377,7 +2377,9 @@ u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu)
          * subslices are enabled, or a count between one and four on the first
          * slice.
          */
-       if (IS_GEN(i915, 11) && slices == 1 && subslices >= 4) {
+       if (IS_GEN(i915, 11) &&
+           slices == 1 &&
+           subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) {

Could use it's own variable max_subslices = min_t(...).

The value is not used elsewhere so I presume you mean for readability? like:

if (IS_GEN(i915, 11) && slices == 1) {
	u8 max_subslices = min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2);

	if (subslices > max_subslices) {
		GEM_BUG_ON(subslices & 1);
		subslice_pg = false;
		slices *= 2;
	}
}

?

I am not too bothered with one vs the other so depends how strong you suggest.

+++ b/include/uapi/drm/i915_drm.h
@@ -1486,9 +1486,52 @@ struct drm_i915_gem_context_param {
  #define   I915_CONTEXT_MAX_USER_PRIORITY       1023 /* inclusive */
  #define   I915_CONTEXT_DEFAULT_PRIORITY                0
  #define   I915_CONTEXT_MIN_USER_PRIORITY       -1023 /* inclusive */
+       /*
+        * When using the following param, value should be a pointer to
+        * drm_i915_gem_context_param_sseu.
+        */
+#define I915_CONTEXT_PARAM_SSEU                0x7
         __u64 value;
  };

Maybe we should amend some comments?

/*
  * NOTE: Can currently only be used to switch between VME enabled
  *       slice configuration vs. full on Icelake (Gen11)
  *
  * NOTE: Slice configuration requests are ignored when perf is enabled.
  */

At first I thought a good idea but on second thought do we want to put such implementation details into uapi headers? Second note maybe, but first I have a feeling is best left out since headers and kernel are not strictly tied up in deployment. Don't know, third opinion from Chris?

Regards,

Tvrtko


Reviewed-by: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>

Regards, Joonas

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux