On 1/12/2025 21:40, Jiang Liu wrote:
When GPU suspend is aborted, do the same for dGPU as APU to reset soc15 asic. Otherwise it may cause following errors: [ 547.229463] amdgpu 0001:81:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110) [ 555.126827] amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110) [ 555.126901] [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed [ 555.126957] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_4_3> failed -110 [ 555.126959] amdgpu 0000:0a:00.0: amdgpu: amdgpu_device_ip_resume failed (-110). [ 555.126965] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -110 [ 555.126966] PM: Device 0000:0a:00.0 failed to resume async: error -110 This fix has been tested on Mi308X. Signed-off-by: Jiang Liu <gerry@xxxxxxxxxxxxxxxxx> Tested-by: Shuo Liu <shuox.liu@xxxxxxxxxxxxxxxxx>
Reviewed-by: Mario Limonciello <mario.limonciello@xxxxxxx> I'll get this committed to amd-staging-drm-next, thanks.
--- drivers/gpu/drm/amd/amdgpu/soc15.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index a59b4c36cad7..0e1daefd1a8e 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -605,12 +605,10 @@ soc15_asic_reset_method(struct amdgpu_device *adev) static bool soc15_need_reset_on_resume(struct amdgpu_device *adev) { /* Will reset for the following suspend abort cases. - * 1) Only reset on APU side, dGPU hasn't checked yet. - * 2) S3 suspend aborted in the normal S3 suspend or - * performing pm core test. + * 1) S3 suspend aborted in the normal S3 suspend + * 2) S3 suspend aborted in performing pm core test. */ - if (adev->flags & AMD_IS_APU && adev->in_s3 && - !pm_resume_via_firmware()) + if (adev->in_s3 && !pm_resume_via_firmware()) return true; else return false;