Re: [PATCH] drm/amdgpu: limit GDS clearing workaround in cold boot sequence

Christian König <ckoenig.leichtzumerken@xxxxxxxxx> · Mon, 10 Feb 2020 10:21:55 +0100

On 10.02.20 at 09:33, Guchun Chen wrote:
The GDS clear workaround causes a gfx failure in the suspend/resume case.

[   98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <gfx_v9_0> failed -110
[   98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110
[   98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110

This workaround is specific to the HW bug that produces GDS ECC errors
on cold boot, so bypass it in the suspend/resume case once the system
has booted.

Hmm, why doesn't this also apply to suspend/resume?

I mean, the hardware is usually turned off during suspend, which is
equivalent to a cold boot, isn't it?

Christian.


Signed-off-by: Guchun Chen <guchun.chen@xxxxxxx>
Reviewed-by: Hawking Zhang <Hawking.Zhang@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 9 ++++++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index de59defa91eb..33f282ff245f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4320,9 +4320,12 @@ static int gfx_v9_0_ecc_late_init(void *handle)
  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  	int r;
  
-	r = gfx_v9_0_do_edc_gds_workarounds(adev);
-	if (r)
-		return r;
+	/* limit gds clearing operation in cold boot sequence */
+	if (!adev->in_suspend) {
+		r = gfx_v9_0_do_edc_gds_workarounds(adev);
+		if (r)
+			return r;
+	}
  
  	/* requires IBs so do in late init after IB pool is initialized */
  	r = gfx_v9_0_do_edc_gpr_workarounds(adev);
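
The gating in the hunk above can be tried out in user space with a small
sketch. Everything below is a hypothetical stand-in: struct mock_device,
the mock_* functions and count_gds_calls() are not amdgpu code, and the
sketch only models the control flow (skip the GDS step when in_suspend is
set, always run the GPR step). It says nothing about whether the hardware
really needs the GDS clear after resume, which is the open question in
this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_device; only what the sketch needs. */
struct mock_device {
	bool in_suspend;     /* set while the suspend/resume path is running */
	int gds_clear_calls; /* how often the mock GDS workaround ran */
	int gpr_clear_calls; /* how often the mock GPR workaround ran */
};

/* Mock of the GDS clearing workaround; always succeeds here. */
static int mock_gds_workaround(struct mock_device *dev)
{
	dev->gds_clear_calls++;
	return 0;
}

/* Mock of the GPR workaround; always succeeds here. */
static int mock_gpr_workaround(struct mock_device *dev)
{
	dev->gpr_clear_calls++;
	return 0;
}

/* Same shape as the patched late-init: GDS step only outside suspend/resume. */
static int mock_ecc_late_init(struct mock_device *dev)
{
	int r;

	if (!dev->in_suspend) {
		r = mock_gds_workaround(dev);
		if (r)
			return r;
	}

	/* Simplification: the real function does more work after this step. */
	return mock_gpr_workaround(dev);
}

/* Run one late-init pass and report how many times the GDS step executed. */
static int count_gds_calls(bool in_suspend)
{
	struct mock_device dev = { .in_suspend = in_suspend };
	int r = mock_ecc_late_init(&dev);

	assert(r == 0);                   /* the mocks never fail */
	assert(dev.gpr_clear_calls == 1); /* GPR step is unconditional */
	return dev.gds_clear_calls;
}
```

With these mocks, count_gds_calls(false) models the cold boot path and
count_gds_calls(true) models resume, where the GDS step is skipped.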
