Problem: After GPU reset on dGPUs with gfx8 compute ring 1.0.0 fails to pass the ring test. Ring registers inspection shows that it's active and no hang is observed (rptr == wptr) No significant diffs were observed between CP_HQD* registers for the ring in good and bad shape. Fix: No clear reason why but reversing the order of ring tests fixes the problem. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c index b2e1376..02f8ca5 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c @@ -4811,8 +4811,10 @@ static int gfx_v8_0_kcq_resume(struct amdgpu_device *adev) if (r) goto done; - /* Test KCQs */ - for (i = 0; i < adev->gfx.num_compute_rings; i++) { + /* Test KCQs - reversing the order of rings seems to fix ring test failure + * after GPU reset + */ + for (i = adev->gfx.num_compute_rings - 1; i >= 0; i--) { ring = &adev->gfx.compute_ring[i]; r = amdgpu_ring_test_helper(ring); } -- 2.7.4 _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx