On 11/08/2022 22:08, Daniele Ceraolo Spurio wrote:
If the GuC CTs are full and we need to stall the request submission
while waiting for space, we save the stalled request and where the stall
occurred; when the CTs have space again we pick up the request submission
from where we left off.
How serious is it? Statement always was CT buffers can never get full
outside the pathological IGT test cases. So I am wondering if this is in
the category of fix for correctness or actually the CT buffers can get
full in normal use so it is imperative to fix.
Regards,
Tvrtko
If a full GT reset occurs, the state of all contexts is cleared and all
non-guilty requests are unsubmitted, therefore we need to restart the
stalled request submission from scratch. To make sure that we do so,
clear the saved request after a reset.
Fixes note: the patch that introduced the bug is in 5.15, but no
officially supported platform had GuC submission enabled by default
in that kernel, so the backport to that particular version (and only
that one) can potentially be skipped.
Fixes: 925dc1cf58ed ("drm/i915/guc: Implement GuC submission tasklet")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
Cc: John Harrison <john.c.harrison@xxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx> # v5.15+
---
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 0d17da77e787..0d56b615bf78 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4002,6 +4002,13 @@ static inline void guc_init_lrc_mapping(struct intel_guc *guc)
/* make sure all descriptors are clean... */
xa_destroy(&guc->context_lookup);
+ /*
+ * A reset might have occurred while we had a pending stalled request,
+ * so make sure we clean that up.
+ */
+ guc->stalled_request = NULL;
+ guc->submission_stall_reason = STALL_NONE;
+
/*
* Some contexts might have been pinned before we enabled GuC
* submission, so we need to add them to the GuC bookeeping.