On 11/12/2018 10:14, Ankit Navik wrote:
From: Praveen Diwakar <praveen.diwakar@xxxxxxxxx>
This patch gives us the count of active pending requests which are yet
to be submitted to the GPU.
V2:
 * Change the request count from 64-bit to atomic_t. (Tvrtko Ursulin)
V3:
 * Remove the mutex protecting the request count.
 * Rebase.
 * Fix the underflow hit by predictive requests. (Tvrtko Ursulin)
Cc: Aravindan Muthukumar <aravindan.muthukumar@xxxxxxxxx>
Cc: Kedar J Karanje <kedar.j.karanje@xxxxxxxxx>
Cc: Yogesh Marathe <yogesh.marathe@xxxxxxxxx>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
No, I did not tag this with r-b and you are not allowed to do this!!
Signed-off-by: Praveen Diwakar <praveen.diwakar@xxxxxxxxx>
Signed-off-by: Ankit Navik <ankit.p.navik@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_gem_context.c | 1 +
drivers/gpu/drm/i915/i915_gem_context.h | 5 +++++
drivers/gpu/drm/i915/i915_request.c | 2 ++
drivers/gpu/drm/i915/intel_lrc.c | 2 ++
4 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b10770c..0bcbe32 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
}
trace_i915_context_create(ctx);
+ atomic_set(&ctx->req_cnt, 0);
return ctx;
}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index b116e49..e824b15 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -194,6 +194,11 @@ struct i915_gem_context {
* context close.
*/
struct list_head handles_list;
+
+ /** req_cnt: tracks the pending commands, based on which we decide to
+ * go for low/medium/high load configuration of the GPU.
+ */
+ atomic_t req_cnt;
};
static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 5c2c93c..b90795a 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1113,6 +1113,8 @@ void i915_request_add(struct i915_request *request)
}
request->emitted_jiffies = jiffies;
+ atomic_inc(&request->gem_context->req_cnt);
+
/*
* Let the backend know a new request has arrived that may need
* to adjust the existing execution schedule due to a high priority
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1744792..d33f5ac 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1127,6 +1127,8 @@ static void execlists_submit_request(struct i915_request *request)
submit_queue(engine, rq_prio(request));
spin_unlock_irqrestore(&engine->timeline.lock, flags);
+
+ atomic_dec(&request->gem_context->req_cnt);
}
static struct i915_request *sched_to_request(struct i915_sched_node *node)
With such placement of the accounting you are only considering requests
which are not yet runnable (due to fences and implicit dependencies). If,
on the contrary, everything is runnable and there is a lot of it waiting
for the GPU to execute, this counter will show zero, and you'll decide to
run in a reduced slice/EU configuration. There have to be benchmarks
which show the adverse effect of this; you just haven't found them yet,
I guess.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx