Hi Janusz,

On 2024-07-18 at 10:55:12 +0200, Janusz Krzysztofik wrote:
> CI reports the following failures from basic-nohangcheck subtest:
>
> (gem_ctx_exec:1115) CRITICAL: Test assertion failure function nohangcheck_hostile, file ../../../usr/src/igt-gpu-tools/tests/intel/gem_ctx_exec.c:374:
> (gem_ctx_exec:1115) CRITICAL: Failed assertion: err == 0
> (gem_ctx_exec:1115) CRITICAL: Last errno: 2, No such file or directory
> (gem_ctx_exec:1115) CRITICAL: Hostile unpreemptable context was not cancelled immediately upon closure
>
> The subtest sets 50 ms preempt timeout on each engine before proceeding
> with submission of spins, then it waits up to 1 second for those spins to
> be terminated.  However, dump of engines' debugfs data performed by the
> subtest after the failure shows preempt timeouts still at their default
> values: 7500 ms on rcs0 and 640 ms on other class engines.  Dmesg records
> confirm preemption timeouts triggered on other engines after 640 ms and
> not on rcs0 within the 1 second limit.
>
> As a first step, let the subtest verify return values of function calls
> supposed to update the preempt timeouts with the new values.  If failed
> on any engine then report that at debug level as a useful hint displayed
> when the test times out on waiting for spin termination.
>
> v2: No changes.
> v3: Don't fail on unsuccessful update of preempt_timeout_ms, older
>     platforms don't support it but can still succeed.
>
> Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/6268
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxxxxxxxx>

LGTM,
Reviewed-by: Kamil Konieczny <kamil.konieczny@xxxxxxxxxxxxxxx>

> ---
>  tests/intel/gem_ctx_exec.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/tests/intel/gem_ctx_exec.c b/tests/intel/gem_ctx_exec.c
> index d6aa8ba0aa..f3e252d10e 100644
> --- a/tests/intel/gem_ctx_exec.c
> +++ b/tests/intel/gem_ctx_exec.c
> @@ -308,8 +308,7 @@ static void nohangcheck_hostile(int i915)
>  	igt_hang_t hang;
>  	int fence = -1;
>  	const intel_ctx_t *ctx;
> -	int err = 0;
> -	int dir;
> +	int dir, err;
>  	uint64_t ahnd;
>
>  	/*
> @@ -333,8 +332,11 @@ static void nohangcheck_hostile(int i915)
>  		int new;
>
>  		/* Set a fast hang detection for a dead context */
> -		gem_engine_property_printf(i915, e->name,
> -					   "preempt_timeout_ms", "%d", 50);
> +		err = gem_engine_property_printf(i915, e->name,
> +						 "preempt_timeout_ms", "%d", 50);
> +		igt_debug_on_f(err < 0,
> +			       "%s preempt_timeout_ms update failed: %d\n",
> +			       e->name, err);
>
>  		spin = __igt_spin_new(i915,
>  				      .ahnd = ahnd,
> @@ -362,6 +364,7 @@ static void nohangcheck_hostile(int i915)
>  	intel_ctx_destroy(i915, ctx);
>  	igt_assert(fence != -1);
>
> +	err = 0;
>  	if (sync_fence_wait(fence, MSEC_PER_SEC)) { /* 640ms preempt-timeout */
>  		igt_debugfs_dump(i915, "i915_engine_info");
>  		err = -ETIME;
> --
> 2.45.2
>
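
For readers following the thread, the "report but don't fail" pattern described in the v3 note can be sketched outside of IGT roughly as below. This is only an illustration under stated assumptions: set_engine_property() is a hypothetical stand-in for gem_engine_property_printf() (assumed, as the patch's `err < 0` check implies, to return a negative value on failure), the choice of rcs0 as the failing engine is made up for the example, and plain fprintf() replaces igt_debug_on_f().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for gem_engine_property_printf(): returns a
 * negative errno-style value when the attribute cannot be written.
 * Here "rcs0" is arbitrarily chosen to simulate a missing attribute.
 */
static int set_engine_property(const char *engine, const char *attr, int value)
{
	(void)attr;
	(void)value;

	if (strcmp(engine, "rcs0") == 0)
		return -ENOENT;

	return 0;
}

/*
 * Try to set the preempt timeout on every engine.  A failed update is
 * only reported, never treated as fatal, so a platform without the
 * attribute can still go on and pass the actual test.
 * Returns the number of engines on which the update failed.
 */
static int set_preempt_timeouts(const char *const *engines, int count,
				int timeout_ms)
{
	int failed = 0;

	assert(engines != NULL);

	for (int i = 0; i < count; i++) {
		int err = set_engine_property(engines[i],
					      "preempt_timeout_ms",
					      timeout_ms);

		if (err < 0) {
			/* Hint for whoever reads the log after a timeout */
			fprintf(stderr,
				"%s preempt_timeout_ms update failed: %d\n",
				engines[i], err);
			failed++;
		}
	}

	return failed;
}
```

Called with the engines { "rcs0", "bcs0", "vcs0" } and a 50 ms timeout, the sketch logs one message for rcs0 and carries on, which mirrors how the patched subtest keeps running and leaves the final verdict to the later sync_fence_wait() check.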