On 05/04/2023 07:56, Upadhyay, Tejas wrote:
-int igt_live_test_end(struct igt_live_test *t)
+int igt_live_test_end(struct igt_live_test *t, struct intel_gt *gt)
{
- struct drm_i915_private *i915 = t->i915;
+ struct drm_i915_private *i915 = gt->i915;
struct intel_engine_cs *engine;
enum intel_engine_id id;
@@ -57,7 +57,7 @@ int igt_live_test_end(struct igt_live_test *t)
return -EIO;
}
- for_each_engine(engine, to_gt(i915), id) {
+ for_each_engine(engine, gt, id) {
if (t->reset_engine[id] ==
i915_reset_engine_count(&i915->gpu_error, engine))
continue;
diff --git a/drivers/gpu/drm/i915/selftests/igt_live_test.h
b/drivers/gpu/drm/i915/selftests/igt_live_test.h
index 36ed42736c52..209b0548c603 100644
--- a/drivers/gpu/drm/i915/selftests/igt_live_test.h
+++ b/drivers/gpu/drm/i915/selftests/igt_live_test.h
@@ -27,9 +27,9 @@ struct igt_live_test {
* e.g. if the GPU was reset.
*/
int igt_live_test_begin(struct igt_live_test *t,
- struct drm_i915_private *i915,
+ struct intel_gt *gt,
const char *func,
const char *name);
-int igt_live_test_end(struct igt_live_test *t);
+int igt_live_test_end(struct igt_live_test *t, struct intel_gt *gt);
Back in the day the plan was that live selftests are device focused and then
we also have intel_gt_live_subtests, which are obviously GT focused. So in
that sense adding a single GT parameter to igt_live_test_begin isn't
something I immediately understand.
Could you explain in one or two practical examples what is not working
properly and how is this patch fixing it?
For example you are running test "live_all_engines(void *arg)",
-- Below test begin, will reset counters for primary GT - Tile0 --
err = igt_live_test_begin(&t, to_gt(i915), __func__, "");
if (err)
goto out_free;
--- Now we loop for all engines, note here for MTL vcs, vecs engines are not on primary GT or tile 0,
So counters did not reset on test begin does not cover them. ---
In test_begin, below will not reset count for vcs, vecs engines on MTL,
for_each_engine(engine, gt, id)
t->reset_engine[id] =
i915_reset_engine_count(&i915->gpu_error, engine);
--- Then below will end test, again for primary GT where above mentioned engines are not there ---
err = igt_live_test_end(&t, to_gt(i915));
In short to me it looks like igt_live_test for device needs attention when we have different engines on different GTs like MTL.
If you either add for_each_gt to that for_each_engine in
igt_live_test_begin and igt_live_test_end, or alternatively
for_each_uabi_engine (maybe misses some internal engines?), everything
works? That would be much smaller patch and wouldn't falsely associate
live tests with a single gt.
Regards,
Tvrtko