On 10/26/2018 04:05 AM, Christian König wrote:
> On 25.10.18 at 22:16, Andrey Grodzovsky wrote:
>> Problem: After GPU reset on dGPUs with gfx8, compute ring
>> 1.0.0 fails to pass the ring test. Ring register inspection
>> shows that it's active and no hang is observed (rptr == wptr).
>> No significant diffs were observed between CP_HQD* registers
>> for the ring in good and bad shape.
>>
>> Fix: No clear reason why, but reversing the order of ring tests
>> fixes the problem.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
>
> Mhm, maybe try adding a delay before the ring test?

First thing I tried, didn't help.

>
> Could be that the rings are started in reverse order as well and for
> some reason the first one is tested too quickly after a reset.

No, the KCQ queue mapping just before the test goes in 0..max order.

Andrey

>
> Anyway patch is Acked-by: Christian König <christian.koenig@xxxxxxx>
>
> Thanks,
> Christian.
>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> index b2e1376..02f8ca5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> @@ -4811,8 +4811,10 @@ static int gfx_v8_0_kcq_resume(struct amdgpu_device *adev)
>>  	if (r)
>>  		goto done;
>>  
>> -	/* Test KCQs */
>> -	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>> +	/* Test KCQs - reversing the order of rings seems to fix ring test failure
>> +	 * after GPU reset
>> +	 */
>> +	for (i = adev->gfx.num_compute_rings - 1; i >= 0; i--) {
>>  		ring = &adev->gfx.compute_ring[i];
>>  		r = amdgpu_ring_test_helper(ring);
>>  	}
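
For reference, the reversed loop in the patch can be sketched in isolation. This is only a minimal standalone illustration, not driver code: struct demo_ring, demo_ring_test() and demo_test_rings_reverse() are hypothetical stand-ins and not part of amdgpu. It assumes a signed loop index, since a countdown written as "i >= 0" would never terminate with an unsigned one. Unlike the quoted hunk, where r is reassigned on every iteration, the sketch keeps the first error it sees; that is a choice made here for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for a compute ring; not the amdgpu structure. */
struct demo_ring {
	int id;
	int healthy;	/* 1 if the simulated ring test passes */
};

/* Hypothetical stand-in for a ring test: 0 on success, -1 on failure. */
static int demo_ring_test(const struct demo_ring *ring)
{
	return ring->healthy ? 0 : -1;
}

/*
 * Test rings from the highest index down to 0 and record the visit order
 * in 'order' (which must hold at least num_rings entries).
 * The index is signed on purpose: with an unsigned i, "i >= 0" is always
 * true and the loop would not terminate.
 * Returns the first error encountered, or 0 if every ring passed.
 */
static int demo_test_rings_reverse(const struct demo_ring *rings,
				   int num_rings, int *order)
{
	int i, r, ret = 0, n = 0;

	assert(num_rings >= 0);

	for (i = num_rings - 1; i >= 0; i--) {
		order[n++] = rings[i].id;
		r = demo_ring_test(&rings[i]);
		if (r && !ret)
			ret = r;	/* keep the first failure */
	}

	return ret;
}
```

Called with three rings of which the middle one is unhealthy, the function visits ids 2, 1, 0 in that order and returns -1.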