On 11/12/2018 10:14, Ankit Navik wrote:
From: Praveen Diwakar <praveen.diwakar@xxxxxxxxx>
This patch gives us the count of active pending requests which are yet
to be submitted to the GPU.
V2:
 * Change the request count from 64-bit to atomic_t. (Tvrtko Ursulin)
V3:
 * Remove the mutex protecting the request count.
 * Rebase.
 * Fix the underflow hit by predictive requests. (Tvrtko Ursulin)
Cc: Aravindan Muthukumar <aravindan.muthukumar@xxxxxxxxx>
Cc: Kedar J Karanje <kedar.j.karanje@xxxxxxxxx>
Cc: Yogesh Marathe <yogesh.marathe@xxxxxxxxx>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
No, I did not tag this with r-b and you are not allowed to do this!!
Signed-off-by: Praveen Diwakar <praveen.diwakar@xxxxxxxxx>
Signed-off-by: Ankit Navik <ankit.p.navik@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_gem_context.c | 1 +
drivers/gpu/drm/i915/i915_gem_context.h | 5 +++++
drivers/gpu/drm/i915/i915_request.c | 2 ++
drivers/gpu/drm/i915/intel_lrc.c | 2 ++
4 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b10770c..0bcbe32 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
}
trace_i915_context_create(ctx);
+ atomic_set(&ctx->req_cnt, 0);
return ctx;
}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index b116e49..e824b15 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -194,6 +194,11 @@ struct i915_gem_context {
* context close.
*/
struct list_head handles_list;
+
+ /** req_cnt: tracks the pending commands, based on which we decide to
+ * go for low/medium/high load configuration of the GPU.
+ */
+ atomic_t req_cnt;
};
static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 5c2c93c..b90795a 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1113,6 +1113,8 @@ void i915_request_add(struct i915_request *request)
}
request->emitted_jiffies = jiffies;
+ atomic_inc(&request->gem_context->req_cnt);
+
/*
* Let the backend know a new request has arrived that may need
* to adjust the existing execution schedule due to a high priority
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1744792..d33f5ac 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1127,6 +1127,8 @@ static void execlists_submit_request(struct i915_request *request)
submit_queue(engine, rq_prio(request));
spin_unlock_irqrestore(&engine->timeline.lock, flags);
+
+ atomic_dec(&request->gem_context->req_cnt);
}
static struct i915_request *sched_to_request(struct i915_sched_node *node)
With such placement of the accounting you are only considering requests
which are not yet runnable (due to fences and implicit dependencies). If,
on the contrary, everything is runnable and there is a lot of it waiting
for the GPU to execute, this counter will show zero, and you'll decide to
run in a reduced slice/EU configuration. There have to be benchmarks
which show the adverse effect of this; you just haven't found them yet,
I guess.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx