Re: [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request

Sagar Arun Kamble <sagar.a.kamble@xxxxxxxxx> · Mon, 27 Nov 2017 12:55:05 +0530

On 11/27/2017 12:50 PM, Sagar Arun Kamble wrote:


On 11/26/2017 5:50 PM, Chris Wilson wrote:
An interesting discussion regarding "hybrid interrupt polling" for NVMe
came to the conclusion that the ideal busyspin before sleeping

I think hybrid approach suggests sleep (1/2 duration) and then 
busyspin (1/2 duration)
(for small I/O size), so this should be "busyspin after sleeping" 
although we are not doing exactly same.

  was half
of the expected request latency (and better if it was already halfway
through that request). This suggested that we too should look again at
our tradeoff between spinning and waiting. Currently, our spin simply
tries to hide the cost of enabling the interrupt, which is good to avoid
penalising nop requests (i.e. test throughput) and not much else.
Studying real world workloads suggests that a spin of upto 500us can
dramatically boost performance, but the suggestion is that this is not
from avoiding interrupt latency per-se, but from secondary effects of
sleeping such as allowing the CPU reduce cstate and context switch away.

v2: Expose the spin setting via Kconfig options for easier adjustment
and testing.
v3: Don't get caught sneaking in a change to the busyspin parameters.

Suggested-by: Sagar Kamble <sagar.a.kamble@xxxxxxxxx>
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Sagar Kamble <sagar.a.kamble@xxxxxxxxx>
Cc: Eero Tamminen <eero.t.tamminen@xxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Ben Widawsky <ben@xxxxxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
---
  drivers/gpu/drm/i915/Kconfig            |  6 ++++++
  drivers/gpu/drm/i915/Kconfig.profile    | 23 +++++++++++++++++++++++
  drivers/gpu/drm/i915/i915_gem_request.c | 28 
++++++++++++++++++++++++----
  3 files changed, 53 insertions(+), 4 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/Kconfig.profile

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index dfd95889f4b7..eae90783f8f9 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -131,3 +131,9 @@ depends on DRM_I915
  depends on EXPERT
  source drivers/gpu/drm/i915/Kconfig.debug
  endmenu
+
+menu "drm/i915 Profile Guided Optimisation"
+    visible if EXPERT
+    depends on DRM_I915
+    source drivers/gpu/drm/i915/Kconfig.profile
+endmenu
diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
b/drivers/gpu/drm/i915/Kconfig.profile
new file mode 100644
index 000000000000..a1aed0e2aad5
--- /dev/null
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -0,0 +1,23 @@
+config DRM_I915_SPIN_REQUEST_IRQ
+    int
+    default 5 # microseconds
+    help
+      Before sleeping waiting for a request (GPU operation) to 
complete,
+      we may spend some time polling for its completion. As the IRQ may
+      take a non-negligible time to setup, we do a short spin first to
+      check if the request will complete quickly.
+
+      May be 0 to disable the initial spin.
+
+config DRM_I915_SPIN_REQUEST_CS
+    int
+    default 2 # microseconds
+    help
+      After sleeping for a request (GPU operation) to complete, we will
+      be woken up on the completion of every request prior to the one
+      being waited on. For very short requests, going back to sleep and
+      be woken up again may add considerably to the wakeup latency. To
+      avoid incurring extra latency from the scheduler, we may 
choose to
+      spin prior to sleeping again.
+
+      May be 0 to disable spinning after being woken.
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index a90bdd26571f..7ac72a0a949c 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -1198,8 +1198,21 @@ long i915_wait_request(struct 
drm_i915_gem_request *req,
      GEM_BUG_ON(!intel_wait_has_seqno(&wait));
      GEM_BUG_ON(!i915_sw_fence_signaled(&req->submit));
  -    /* Optimistic short spin before touching IRQs */
-    if (__i915_spin_request(req, wait.seqno, state, 5))
+    /* Optimistic spin before touching IRQs.
+     *
+     * We may use a rather large value here to offset the penalty of
+     * switching away from the active task. Frequently, the client will
+     * wait upon an old swapbuffer to throttle itself to remain 
within a
+     * frame of the gpu. If the client is running in lockstep with 
the gpu,
+     * then it should not be waiting long at all, and a sleep now 
will incur
+     * extra scheduler latency in producing the next frame. So we sleep
+     * for longer to try and keep the client running.
+     *
+     * We need ~5us to enable the irq, ~20us to hide a context switch.

This comment fits more with next patch.

+     */
+    if (CONFIG_DRM_I915_SPIN_REQUEST_IRQ &&

If this is set to 0 we still want to sleep and not go to complete.
Hit wicket :) ... Ignore this comment.
With above comments above commit message and comment update this change 
looks fine to me.
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@xxxxxxxxx>

+        __i915_spin_request(req, wait.seqno, state,
+                CONFIG_DRM_I915_SPIN_REQUEST_IRQ))
          goto complete;
        set_current_state(state);
@@ -1255,8 +1268,15 @@ long i915_wait_request(struct 
drm_i915_gem_request *req,
              __i915_wait_request_check_and_reset(req))
              continue;
  -        /* Only spin if we know the GPU is processing this request */
-        if (__i915_spin_request(req, wait.seqno, state, 2))
+        /*
+         * A quick spin now we are on the CPU to offset the cost of
+         * context switching away (and so spin for roughly the same as
+         * the scheduler latency). We only spin if we know the GPU is
+         * processing this request, and so likely to finish shortly.
+         */
+        if (CONFIG_DRM_I915_SPIN_REQUEST_CS &&

Same here.

+            __i915_spin_request(req, wait.seqno, state,
+                    CONFIG_DRM_I915_SPIN_REQUEST_CS))
              break;
            if (!intel_wait_check_request(&wait, req)) {

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx