On Thu, Apr 23, 2020 at 10:55 AM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> Yeah, we certainly could try this again. But maybe split that up into
> individual patches for gfx7/8/9.
>
> In other words, make it easy to revert if this still doesn't work well on
> gfx7 or some other generation.

Yeah, unless there is a good reason, I don't think we should do this.
IIRC, compute rings randomly fail to recover on a lot of hw generations.

Alex

>
> Christian.
>
> On 23.04.20 at 15:43, Zhang, Hawking wrote:
> > [AMD Official Use Only - Internal Distribution Only]
> >
> > Would you mind enabling this and trying it again? The recent GPU reset
> > testing on vega20 looks very positive.
> >
> > Regards,
> > Hawking
> > -----Original Message-----
> > From: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
> > Sent: Thursday, April 23, 2020 20:31
> > To: Zhang, Hawking <Hawking.Zhang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> > Subject: Re: [PATCH 1/2] drm/amdgpu: stop cp resume when compute ring test failed
> >
> > On 23.04.20 at 11:01, Hawking Zhang wrote:
> >> driver should stop cp resume once compute ring test failed
> > Mhm, we intentionally ignored those errors because the compute rings
> > sometimes don't come up again after a GPU reset.
> >
> > We even have the necessary logic in the SW scheduler to redirect the jobs
> > to another compute ring when one fails to come up again.
> >
> > Christian.
> >
> >> Change-Id: I4cd3328f38e0755d0c877484936132d204c9fe50
> >> Signed-off-by: Hawking Zhang <Hawking.Zhang@xxxxxxx>
> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 4 +++-
> >>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 4 +++-
> >>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +++-
> >>  3 files changed, 9 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> >> index b2f10e3..fcee758 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> >> @@ -3132,7 +3132,9 @@ static int gfx_v7_0_cp_compute_resume(struct amdgpu_device *adev)
> >>
> >>  	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
> >>  		ring = &adev->gfx.compute_ring[i];
> >> -		amdgpu_ring_test_helper(ring);
> >> +		r = amdgpu_ring_test_helper(ring);
> >> +		if (r)
> >> +			return r;
> >>  	}
> >>
> >>  	return 0;
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> >> index 6c56ced..8dc8e90 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> >> @@ -4781,7 +4781,9 @@ static int gfx_v8_0_cp_test_all_rings(struct amdgpu_device *adev)
> >>
> >>  	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
> >>  		ring = &adev->gfx.compute_ring[i];
> >> -		amdgpu_ring_test_helper(ring);
> >> +		r = amdgpu_ring_test_helper(ring);
> >> +		if (r)
> >> +			return r;
> >>  	}
> >>
> >>  	return 0;
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> >> index 09aa5f5..20937059 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> >> @@ -3846,7 +3846,9 @@ static int gfx_v9_0_cp_resume(struct amdgpu_device *adev)
> >>
> >>  	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
> >>  		ring = &adev->gfx.compute_ring[i];
> >> -		amdgpu_ring_test_helper(ring);
> >> +		r = amdgpu_ring_test_helper(ring);
> >> +		if (r)
> >> +			return r;
> >>  	}
> >>
> >>  	gfx_v9_0_enable_gui_idle_interrupt(adev, true);
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
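
For anyone following the thread, the difference between the two behaviours under discussion can be modelled in plain user-space C. This is only a rough sketch, not driver code: struct demo_ring, demo_ring_test() and the two demo_resume_*() functions are hypothetical names made up for illustration, the hw_ok flag simply pretends to be the hardware state after a reset, and the per-ring ready flag and the returned ring count are assumptions of the sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a compute ring; NOT struct amdgpu_ring. */
struct demo_ring {
	bool hw_ok;	/* pretend hardware state: did the ring come back? */
	bool ready;	/* assumption: the test records whether the ring is usable */
};

/* Hypothetical stand-in for a ring test: returns 0 on success, -1 on failure. */
static int demo_ring_test(struct demo_ring *ring)
{
	ring->ready = ring->hw_ok;
	return ring->hw_ok ? 0 : -1;
}

/* Behaviour proposed by the patch: stop at the first ring that fails. */
static int demo_resume_strict(struct demo_ring *rings, size_t n)
{
	size_t i;
	int r;

	assert(rings != NULL);
	for (i = 0; i < n; i++) {
		r = demo_ring_test(&rings[i]);
		if (r)
			return r;	/* later rings are never tested */
	}
	return 0;
}

/*
 * Behaviour before the patch: test every ring and ignore failures.
 * Returning the number of usable rings is an addition of this sketch.
 */
static size_t demo_resume_tolerant(struct demo_ring *rings, size_t n)
{
	size_t i, usable = 0;

	assert(rings != NULL);
	for (i = 0; i < n; i++) {
		if (demo_ring_test(&rings[i]) == 0)
			usable++;
	}
	return usable;
}
```

With three rings where only the middle one fails to come back, the strict variant returns an error and never tests the third ring, while the tolerant variant leaves two rings marked ready. That is the situation Christian describes, where jobs can be redirected to another compute ring.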