On 2021-12-10 11:19 a.m., Quan, Evan
wrote:
[JZ] Then what is concern for vcn1 not doing the same thing as vcn_v2_0_stop?[AMD Official Use Only]-----Original Message----- From: Lazar, Lijo <Lijo.Lazar@xxxxxxx> Sent: Friday, December 10, 2021 8:25 PM To: Gong, Curry <Curry.Gong@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Zhu, James <James.Zhu@xxxxxxx>; Liu, Leo <Leo.Liu@xxxxxxx>; Quan, Evan <Evan.Quan@xxxxxxx> Subject: Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled On 12/10/2021 5:11 PM, chen gong wrote:Play a video on the raven (or PCO, raven2) platform, and then do the S3 test. When resume, the following error will be reported: amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec test failed (-110) [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume ofIPblock <vcn_v1_0> failed -110 amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_resume failed (-110). PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110 [why] When playing the video: The power state flag of the vcn block is set to POWER_STATE_ON. When doing suspend: There is no change to the power state flag of the vcn block, it is still POWER_STATE_ON. When doing resume: Need to open the power gate of the vcn block and set the power state flag of the VCN block to POWER_STATE_ON. But at this time, the power state flag of the vcn block is already POWER_STATE_ON. The power status flag check in the "8f2cdefdrm/amd/pm:avoid duplicate powergate/ungate setting" patch will return the amdgpu_dpm_set_powergating_by_smu function directly. As a result, the gate of the power was not opened, causing the subsequent ring test to fail. [how] In the suspend function of the vcn block, explicitly change the power state flag of the vcn block to POWER_STATE_OFF. Signed-off-by: chen gong <curry.gong@xxxxxxx> --- drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c index d54d720..d73676b 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c @@ -246,6 +246,13 @@ static int vcn_v1_0_suspend(void *handle) { int r; struct amdgpu_device *adev = (struct amdgpu_device *)handle; + bool cancel_success; + + cancel_success = cancel_delayed_work_sync(&adev- vcn.idle_work); + if (cancel_success) { + if (adev->pm.dpm_enabled) + amdgpu_dpm_enable_uvd(adev, false); + }Probably this is a common issue. Can you try moving this to amdgpu_vcn_suspend?[Quan, Evan] Yes, amdgpu_vcn_suspend seems a more proper place. However, the vcn code is not consistent. For vcn v2 and later, they already do the manual gate operation in their suspend routine(like vcn_v2_0_stop). So, it is actually only vcn_v1_0.c suffers this issue.
if (adev->pm.dpm_enabled) amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,AMD_PG_STATE_GATE); Call this after cancel_delayed_work_sync. Shouldn't have any effect if idle work already put it in PG state. Evan, what do you think?[Quan, Evan] Should not for now. But I'm considering to raise the dev_dbg prompt in amdgpu_dpm_set_powergating_by_smu for double gate/ungate to dev_info or dev_warn. Hopefully that can help to capture some potential issues like this. So, better to keep the check for the return value of cancel_delayed_work_sync here. BR EvanThanks, Lijor = vcn_v1_0_hw_fini(adev); if (r)