On 10/12/2023 03:21, Tvrtko Ursulin wrote:
On 21/09/2023 19:20, John.C.Harrison@xxxxxxxxx wrote:
From: John Harrison <John.C.Harrison@xxxxxxxxx>
If an active context has been banned (e.g. Ctrl+C killed) then it is
likely to be reset as part of evicting it from the hardware. That
results in an 'ignoring context reset notification: banned = 1'
message at info level. This confuses/concerns people and makes them
think something has gone wrong when it hasn't.
Noticed the "confuses/concerns people" part while preparing the 6.7
pull request, and the fact that there is no Fixes: tag. Is this something
that would be worth sending to stable (manually)? If yes, could you
do that please? Only if there were actual user bugs filed, I guess.
No upstream bugs that I am aware of. There were very occasional
concerned emails from internal test teams (E2E and such rather than
kernel) and I think one internal bug was logged about it being seen when
running some automated user interaction stress test thing (monkey runner
or similar). So I am not sure that it is worth the effort of a backport to
older trees. And you can't really call it a bug in an older patch. The
message was never an error or even a warning, just an info level one.
John.
Regards,
Tvrtko
There is already a debug level message with essentially the same
information. So drop the 'ignore' info level one and just add the
'ignore' flag to the debug level one instead (which will therefore not
appear by default but will still show up in CI runs).
Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
---
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index cabdc645fcddb..da7331346df1f 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4770,19 +4770,19 @@ static void guc_context_replay(struct intel_context *ce)
static void guc_handle_context_reset(struct intel_guc *guc,
struct intel_context *ce)
{
+ bool capture = intel_context_is_schedulable(ce);
+
trace_intel_context_reset(ce);
- guc_dbg(guc, "Got context reset notification: 0x%04X on %s, exiting = %s, banned = %s\n",
+ guc_dbg(guc, "%s context reset notification: 0x%04X on %s, exiting = %s, banned = %s\n",
+ capture ? "Got" : "Ignoring",
ce->guc_id.id, ce->engine->name,
str_yes_no(intel_context_is_exiting(ce)),
str_yes_no(intel_context_is_banned(ce)));
- if (likely(intel_context_is_schedulable(ce))) {
+ if (capture) {
capture_error_state(guc, ce);
guc_context_replay(ce);
- } else {
- guc_info(guc, "Ignoring context reset notification of exiting context 0x%04X on %s",
- ce->guc_id.id, ce->engine->name);
}
}
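
For anyone reading along without the tree handy, here is a rough userspace sketch of what the handler does after the patch: one debug line whose prefix is "Got" or "Ignoring", with capture/replay happening only in the "Got" case. Everything here is a stand-in for illustration: struct fake_context, fake_context_is_schedulable(), yes_no() and format_reset_notification() are hypothetical names, not i915 APIs, and the "schedulable means neither exiting nor banned" rule is my assumption about what intel_context_is_schedulable() checks.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the few fields the handler reads.
 * This is NOT the real struct intel_context. */
struct fake_context {
	unsigned int guc_id;
	const char *engine_name;
	bool exiting;
	bool banned;
};

/* Assumption: a context is schedulable when it is neither exiting
 * nor banned. */
static bool fake_context_is_schedulable(const struct fake_context *ce)
{
	return !ce->exiting && !ce->banned;
}

/* Local stand-in for the kernel's str_yes_no() helper. */
static const char *yes_no(bool v)
{
	return v ? "yes" : "no";
}

/*
 * Formats the single debug-level line into buf and returns true when
 * the real handler would go on to capture error state and replay the
 * context. There is no separate info-level "Ignoring ..." message.
 */
static bool format_reset_notification(const struct fake_context *ce,
				      char *buf, size_t len)
{
	bool capture = fake_context_is_schedulable(ce);

	snprintf(buf, len,
		 "%s context reset notification: 0x%04X on %s, exiting = %s, banned = %s",
		 capture ? "Got" : "Ignoring",
		 ce->guc_id, ce->engine_name,
		 yes_no(ce->exiting), yes_no(ce->banned));

	return capture;
}
```

Calling format_reset_notification() on a banned context yields a line starting with "Ignoring" and a false return, so the caller skips capture and replay. In the driver that line goes out via guc_dbg() and so stays hidden unless debug output is enabled.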