On 10/10/2023 17:17, Andi Shyti wrote:
Hi Matt,
FIXME: CAT errors are cropping up on MTL. This removes them,
but the real root cause must still be diagnosed.
Do you have a link to specific IGT test(s) that illustrate the CAT
errors so that we can ensure that they now appear fixed in CI?
this one:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_124599v1/bat-mtlp-8/igt@i915_selftest@live@xxxxxxxxxxxxxx
Andi
Wait, now I'm confused. That's a failure caused by a different patch
series (one that we won't be moving forward with). The live@hugepages
test is always passing on drm-tip today:
https://intel-gfx-ci.01.org/tree/drm-tip/igt@i915_selftest@live@xxxxxxxxxxxxxx
yes, true, but that patch allows us to move forward with the
testing and hit the CAT error.
(it was the most reachable link I found :))
Is there a test that's giving CAT errors on drm-tip itself (even
sporadically) that we can monitor to see the impact of Jonathan's patch
here?
Otherwise this one:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13667/re-mtlp-3/igt@gem_exec_fence@xxxxxxxxxxxxx#dmesg-warnings11
Parachuting in on a tangent - please do not mix up CAT and CT errors. CAT, for me at least, is associated with CATastrophic faults reported over the CT channel, like GuC page faulting IIRC.
For CT errors, maybe the GuC folks can shed some light on what they mean.
Regards,
Tvrtko