On Wed, Mar 13, 2024 at 4:12 PM Felix Kuehling <felix.kuehling@xxxxxxx> wrote:
>
> On 2024-03-11 11:14, Sasha Levin wrote:
> > From: Prike Liang <Prike.Liang@xxxxxxx>
> >
> > [ Upstream commit c671ec01311b4744b377f98b0b4c6d033fe569b3 ]
> >
> > Currently, GPU resets can now be performed successfully on the Raven
> > series. While GPU reset is required for the S3 suspend abort case.
> > So now can enable gpu reset for S3 abort cases on the Raven series.
>
> This looks suspicious to me. I'm not sure what conditions made the GPU
> reset successful. But unless all the changes involved were also
> backported, this should probably not be applied to older kernel
> branches. I'm speculating it may be related to the removal of AMD IOMMUv2.
>

We should get confirmation from Prike, but I think he tested this on
older kernels as well.

Alex

> Regards,
>    Felix
>
>
> >
> > Signed-off-by: Prike Liang <Prike.Liang@xxxxxxx>
> > Acked-by: Alex Deucher <alexander.deucher@xxxxxxx>
> > Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> > Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/soc15.c | 45 +++++++++++++++++-------------
> >  1 file changed, 25 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 6a3486f52d698..ef5b3eedc8615 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -605,11 +605,34 @@ soc15_asic_reset_method(struct amdgpu_device *adev)
> >  	return AMD_RESET_METHOD_MODE1;
> >  }
> >
> > +static bool soc15_need_reset_on_resume(struct amdgpu_device *adev)
> > +{
> > +	u32 sol_reg;
> > +
> > +	sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
> > +
> > +	/* Will reset for the following suspend abort cases.
> > +	 * 1) Only reset limit on APU side, dGPU hasn't checked yet.
> > +	 * 2) S3 suspend abort and TOS already launched.
> > +	 */
> > +	if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > +			!adev->suspend_complete &&
> > +			sol_reg)
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> >  static int soc15_asic_reset(struct amdgpu_device *adev)
> >  {
> >  	/* original raven doesn't have full asic reset */
> > -	if ((adev->apu_flags & AMD_APU_IS_RAVEN) ||
> > -	    (adev->apu_flags & AMD_APU_IS_RAVEN2))
> > +	/* On the latest Raven, the GPU reset can be performed
> > +	 * successfully. So now, temporarily enable it for the
> > +	 * S3 suspend abort case.
> > +	 */
> > +	if (((adev->apu_flags & AMD_APU_IS_RAVEN) ||
> > +	    (adev->apu_flags & AMD_APU_IS_RAVEN2)) &&
> > +		!soc15_need_reset_on_resume(adev))
> >  		return 0;
> >
> >  	switch (soc15_asic_reset_method(adev)) {
> > @@ -1490,24 +1513,6 @@ static int soc15_common_suspend(void *handle)
> >  	return soc15_common_hw_fini(adev);
> >  }
> >
> > -static bool soc15_need_reset_on_resume(struct amdgpu_device *adev)
> > -{
> > -	u32 sol_reg;
> > -
> > -	sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
> > -
> > -	/* Will reset for the following suspend abort cases.
> > -	 * 1) Only reset limit on APU side, dGPU hasn't checked yet.
> > -	 * 2) S3 suspend abort and TOS already launched.
> > -	 */
> > -	if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > -			!adev->suspend_complete &&
> > -			sol_reg)
> > -		return true;
> > -
> > -	return false;
> > -}
> > -
> >  static int soc15_common_resume(void *handle)
> >  {
> >  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;