On Mon, Nov 15, 2021 at 2:50 AM Lazar, Lijo <lijo.lazar@xxxxxxx> wrote:
>
>
>
> On 11/12/2021 9:55 PM, Alex Deucher wrote:
> > If the platform suspend happens to fail and the power rail
> > is not turned off, the GPU will be in an unknown state on
> > resume, so reset the asic so that it will be in a known
> > good state on resume even if the platform suspend failed.
> >
>
> Any more background info on the issue? Is there a need to trigger BACO
> or D3cold entry similar to how it's done for runtime suspend?

Basically something like the following: the user requests S3 and the
drivers start to do their suspend thing, but then something interrupts
it (e.g., the user plugs/unplugs a USB device or S3 gets interrupted
for some other reason).  At that point, the power rail has not been
turned off.  The kernel then starts calling the resume functions for
each device because the suspend was aborted.  However, since the power
rail was not turned off, the GPU is still initialized, so the driver
can't properly re-init it without a reset.

Alex

>
> Thanks,
> Lijo
>
> > v2: handle s0ix
> >
> > Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index 1db76429a673..b4591f6e82dd 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -2165,7 +2165,10 @@ static int amdgpu_pmops_suspend(struct device *dev)
> >       adev->in_s3 = true;
> >       r = amdgpu_device_suspend(drm_dev, true);
> >       adev->in_s3 = false;
> > -
> > +     if (r)
> > +             return r;
> > +     if (!adev->in_s0ix)
> > +             r = amdgpu_asic_reset(adev);
> >       return r;
> >   }
> >
> >
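
To make the aborted-suspend sequence described in the reply above concrete, here is a minimal user-space sketch in C. It is not driver code: struct mock_gpu and every mock_* function are hypothetical names invented for illustration, and the rule that resume fails whenever the hardware is still initialized is a simplifying assumption standing in for "the driver can't properly re-init it without a reset". Only the control flow of mock_pmops_suspend() is modelled on the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not a real amdgpu structure. */
struct mock_gpu {
	bool hw_initialized;	/* hardware still holds pre-suspend state */
	bool in_s0ix;		/* s0ix path skips the reset, as in the diff */
};

/* Driver-side suspend: quiesces the device but cannot cut power itself. */
int mock_device_suspend(struct mock_gpu *gpu)
{
	assert(gpu);
	return 0;	/* hardware state survives until the rail is cut */
}

/* Reset puts the hardware back into a known, uninitialized state. */
int mock_asic_reset(struct mock_gpu *gpu)
{
	assert(gpu);
	gpu->hw_initialized = false;
	return 0;
}

/* Same shape as the patched callback; reset_on_suspend toggles the fix. */
int mock_pmops_suspend(struct mock_gpu *gpu, bool reset_on_suspend)
{
	int r = mock_device_suspend(gpu);

	if (r)
		return r;
	if (reset_on_suspend && !gpu->in_s0ix)
		r = mock_asic_reset(gpu);
	return r;
}

/* Platform step: only a completed S3 entry turns the power rail off. */
void mock_platform_sleep(struct mock_gpu *gpu, bool aborted)
{
	assert(gpu);
	if (!aborted)
		gpu->hw_initialized = false;
}

/* Assumption: re-init only works if the hardware starts out clean. */
int mock_pmops_resume(struct mock_gpu *gpu)
{
	assert(gpu);
	if (gpu->hw_initialized)
		return -1;	/* stale state left over from before suspend */
	gpu->hw_initialized = true;
	return 0;
}

/* One suspend/resume cycle; returns the result of the resume step. */
int mock_cycle(bool aborted, bool reset_on_suspend)
{
	struct mock_gpu gpu = { .hw_initialized = true, .in_s0ix = false };
	int r = mock_pmops_suspend(&gpu, reset_on_suspend);

	if (r)
		return r;
	mock_platform_sleep(&gpu, aborted);
	return mock_pmops_resume(&gpu);
}
```

In this model, mock_cycle(true, false) is the problem case: the suspend is aborted, the rail stays on, and resume sees stale state. With reset_on_suspend set, resume starts from a clean state whether or not the platform suspend completed.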