[PATCH 3/5] drm/i915: Increase busyspin limit before a context-switch

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Sat, 28 Jul 2018 17:46:21 +0100

Looking at the latency distribution of i915_wait_request across a set
of GL benchmarks, we see:

broadwell# python bcc/tools/funclatency.py -u i915_wait_request
   usecs               : count     distribution
       0 -> 1          : 29184    |****************************************|
       2 -> 3          : 5767     |*******                                 |
       4 -> 7          : 3000     |****                                    |
       8 -> 15         : 491      |                                        |
      16 -> 31         : 140      |                                        |
      32 -> 63         : 203      |                                        |
      64 -> 127        : 543      |                                        |
     128 -> 255        : 881      |*                                       |
     256 -> 511        : 1209     |*                                       |
     512 -> 1023       : 1739     |**                                      |
    1024 -> 2047       : 22855    |*******************************         |
    2048 -> 4095       : 1725     |**                                      |
    4096 -> 8191       : 5813     |*******                                 |
    8192 -> 16383      : 5348     |*******                                 |
   16384 -> 32767      : 1000     |*                                       |
   32768 -> 65535      : 4400     |******                                  |
   65536 -> 131071     : 296      |                                        |
  131072 -> 262143     : 225      |                                        |
  262144 -> 524287     : 4        |                                        |
  524288 -> 1048575    : 1        |                                        |
 1048576 -> 2097151    : 1        |                                        |
 2097152 -> 4194303    : 1        |                                        |

broxton# python bcc/tools/funclatency.py -u i915_wait_request
   usecs               : count     distribution
       0 -> 1          : 5523     |*************************************   |
       2 -> 3          : 1340     |*********                               |
       4 -> 7          : 2100     |**************                          |
       8 -> 15         : 755      |*****                                   |
      16 -> 31         : 211      |*                                       |
      32 -> 63         : 53       |                                        |
      64 -> 127        : 71       |                                        |
     128 -> 255        : 113      |                                        |
     256 -> 511        : 262      |*                                       |
     512 -> 1023       : 358      |**                                      |
    1024 -> 2047       : 1105     |*******                                 |
    2048 -> 4095       : 848      |*****                                   |
    4096 -> 8191       : 1295     |********                                |
    8192 -> 16383      : 5894     |****************************************|
   16384 -> 32767      : 4270     |****************************            |
   32768 -> 65535      : 5622     |**************************************  |
   65536 -> 131071     : 306      |**                                      |
  131072 -> 262143     : 50       |                                        |
  262144 -> 524287     : 76       |                                        |
  524288 -> 1048575    : 34       |                                        |
 1048576 -> 2097151    : 0        |                                        |
 2097152 -> 4194303    : 1        |                                        |

Picking 20us for the context-switch busyspin has the dual advantage of
catching the most frequent short waits (the 0-15us buckets above) while
avoiding the cost of a context switch. 20us is also the typical latency
of 2 context-switches, i.e. the cost of going to sleep and being woken
again, not counting the secondary effects of cache flushing.
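
As a rough cross-check of the choice of limit, the histograms above can be
summed to see how many waits finish inside a given spin window. This is
only a sketch: the helper name covered_waits and the rule "a bucket counts
only if its upper edge is below the spin limit" are assumptions made for
illustration, and it treats the total i915_wait_request latency as if it
were the time a spin would have to cover.

```python
# Inclusive upper edge of each funclatency bucket, in microseconds.
# Both histograms use the same 22 power-of-two buckets: 1, 3, 7, ..., 4194303.
UPPER_EDGES_US = [2 ** (i + 1) - 1 for i in range(22)]

# Counts copied from the funclatency output above, in bucket order.
BROADWELL_COUNTS = [29184, 5767, 3000, 491, 140, 203, 543, 881, 1209, 1739,
                    22855, 1725, 5813, 5348, 1000, 4400, 296, 225, 4, 1, 1, 1]
BROXTON_COUNTS = [5523, 1340, 2100, 755, 211, 53, 71, 113, 262, 358,
                  1105, 848, 1295, 5894, 4270, 5622, 306, 50, 76, 34, 0, 1]


def covered_waits(counts, spin_us):
    """Count waits in buckets that lie entirely below spin_us.

    Hypothetical helper: a bucket that straddles the limit (e.g. 16-31us
    for a 20us spin) is left out, so the result is a lower bound.
    """
    return sum(c for hi, c in zip(UPPER_EDGES_US, counts) if hi < spin_us)


for name, counts in (("broadwell", BROADWELL_COUNTS),
                     ("broxton", BROXTON_COUNTS)):
    total = sum(counts)
    for spin_us in (2, 20):
        hit = covered_waits(counts, spin_us)
        print(f"{name}: spin {spin_us:2d}us covers at least "
              f"{hit}/{total} waits ({100.0 * hit / total:.1f}%)")
```

On these numbers, at least roughly 45% of the Broadwell waits and 32% of
the Broxton waits complete within the 20us window, against roughly 34% and
18% for the old 2us limit.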

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Sagar Kamble <sagar.a.kamble@xxxxxxxxx>
Cc: Eero Tamminen <eero.t.tamminen@xxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Ben Widawsky <ben@xxxxxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
---
 drivers/gpu/drm/i915/Kconfig.profile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 63cb744d920d..de394dea4a14 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -14,7 +14,7 @@ config DRM_I915_SPIN_REQUEST_IRQ
 
 config DRM_I915_SPIN_REQUEST_CS
 	int
-	default 2 # microseconds
+	default 20 # microseconds
 	help
 	  After sleeping for a request (GPU operation) to complete, we will
 	  be woken up on the completion of every request prior to the one
-- 
2.18.0
