Re: [PATCH 2/2] drm/amdgpu: reset gpu for pm abort case

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 24, 2024 at 2:17 AM Prike Liang <Prike.Liang@xxxxxxx> wrote:
>
> In the pm abort case the gfx power rail not turn off from FCH side and
> this will lead to the gfx reinitialized failed base on the unknown gfx
> HW status, so let's reset the gpu to a known good power state.
>
> Signed-off-by: Prike Liang <Prike.Liang@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 5 +++++
>  drivers/gpu/drm/amd/amdgpu/soc15.c      | 7 ++++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 9153f69bad7f..14b66c49a536 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -2935,6 +2935,11 @@ static int psp_resume(void *handle)
>
>         dev_info(adev->dev, "PSP is resuming...\n");
>
> +       if(amdgpu_asic_need_reset_on_init(adev)) {
> +               DRM_INFO("PM abort case and let's reset asic \n");
> +               amdgpu_asic_reset(adev);
> +       }
> +

Seems like it would be better to put this in the resume callback.

>         if (psp->mem_train_ctx.enable_mem_training) {
>                 ret = psp_mem_training(psp, PSP_MEM_TRAIN_RESUME);
>                 if (ret) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 15033efec2ba..6ec4f6958c4c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -804,9 +804,15 @@ static bool soc15_need_reset_on_init(struct amdgpu_device *adev)
>         if (adev->asic_type == CHIP_RENOIR)
>                 return true;
>
> +       sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);

Is this register consistent for all soc15 chips?

Alex

> +
>         /* Just return false for soc15 GPUs.  Reset does not seem to
>          * be necessary.
>          */
> +       if (adev->in_suspend && !adev->pm_complete &&
> +                       sol_reg)
> +               return true;
> +
>         if (!amdgpu_passthrough(adev))
>                 return false;
>
> @@ -816,7 +822,6 @@ static bool soc15_need_reset_on_init(struct amdgpu_device *adev)
>         /* Check sOS sign of life register to confirm sys driver and sOS
>          * are already been loaded.
>          */
> -       sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
>         if (sol_reg)
>                 return true;
>> --
> 2.34.1
>




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux