On Thu, Feb 10, 2022 at 10:42 AM Limonciello, Mario <Mario.Limonciello@xxxxxxx> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: Alex Deucher <alexdeucher@xxxxxxxxx>
> > Sent: Thursday, February 10, 2022 09:28
> > To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>
> > Cc: Mahapatra, Rajib <Rajib.Mahapatra@xxxxxxx>; Liang, Prike
> > <Prike.Liang@xxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>;
> > amd-gfx@xxxxxxxxxxxxxxxxxxxxx; S, Shirish <Shirish.S@xxxxxxx>
> > Subject: Re: [PATCH v2] drm/amdgpu: skipping SDMA hw_init and hw_fini for
> > S0ix.
> >
> > On Thu, Feb 10, 2022 at 9:04 AM Limonciello, Mario
> > <Mario.Limonciello@xxxxxxx> wrote:
> > >
> > > [Public]
> > >
> > > > -----Original Message-----
> > > > From: Mahapatra, Rajib <Rajib.Mahapatra@xxxxxxx>
> > > > Sent: Thursday, February 10, 2022 07:35
> > > > To: Liang, Prike <Prike.Liang@xxxxxxx>; Limonciello, Mario
> > > > <Mario.Limonciello@xxxxxxx>; Deucher, Alexander
> > > > <Alexander.Deucher@xxxxxxx>
> > > > Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; S, Shirish <Shirish.S@xxxxxxx>;
> > > > Mahapatra, Rajib <Rajib.Mahapatra@xxxxxxx>
> > > > Subject: [PATCH v2] drm/amdgpu: skipping SDMA hw_init and hw_fini for
> > > > S0ix.
> > > >
> > > > [Why]
> > > > SDMA ring buffer test failed if suspend is aborted during
> > > > S0i3 resume.
> > > >
> > > > [How]
> > > > If suspend is aborted for some reason during S0i3 resume
> > > > cycle, it follows SDMA ring test failing and errors in amdgpu
> > > > resume. For RN/CZN/Picasso, SMU saves and restores SDMA
> > > > registers during S0ix cycle. So, skipping SDMA suspend and
> > > > resume from driver solves the issue. This time, the system
> > > > is able to resume gracefully even the suspend is aborted.
> > > >
> > > > v2: add changes on sdma_v4, skipping SDMA hw_init and hw_fini.
> > >
> > > This line in the commit message should be "below" the ---
> > >
> > > Besides that the code is better.
> > >
> > > Reviewed-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> >
> > Reviewed-by: Alex Deucher <alexander.deucher@xxxxxxx>
> >
> > I presume sdma_v5.2.c needs a similar fix?
>
> VG doesn't do s0i3 right?

Right.

> No, YC should not take a similar fix. YC had an architectural change and to
> avoid a "similar" problem takes 26db706a6d77b9e184feb11725e97e53b7a89519.

Isn't that likely just a workaround for the same issue? This seems cleaner.

Alex

>
> >
> > Alex
> >
> > >
> > > >
> > > > Signed-off-by: Rajib Mahapatra <rajib.mahapatra@xxxxxxx>
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 ++++++++
> > > >  1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > > > index 06a7ceda4c87..02115d63b071 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > > > @@ -2058,6 +2058,10 @@ static int sdma_v4_0_suspend(void *handle)
> > > >  {
> > > >  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > > >
> > > > +	/* SMU saves SDMA state for us */
> > > > +	if (adev->in_s0ix)
> > > > +		return 0;
> > > > +
> > > >  	return sdma_v4_0_hw_fini(adev);
> > > >  }
> > > >
> > > > @@ -2065,6 +2069,10 @@ static int sdma_v4_0_resume(void *handle)
> > > >  {
> > > >  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > > >
> > > > +	/* SMU restores SDMA state for us */
> > > > +	if (adev->in_s0ix)
> > > > +		return 0;
> > > > +
> > > >  	return sdma_v4_0_hw_init(adev);
> > > >  }
> > > >
> > > > --
> > > > 2.25.1
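
For readers who want to see the control flow of the patch in isolation, here is a minimal user-space sketch of the early-return pattern it adds. It is not the driver code: struct mock_device, its fields other than in_s0ix, and every mock_* function are hypothetical stand-ins, and the model simply assumes (as the commit message states for RN/CZN/Picasso) that something outside the driver preserves SDMA state across S0ix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_device. Only in_s0ix
 * corresponds to a field named in the patch; the rest is bookkeeping
 * invented for this sketch. */
struct mock_device {
	bool in_s0ix;       /* mirrors adev->in_s0ix */
	bool sdma_running;  /* assumed model of the engine state */
	int hw_fini_calls;  /* how often the driver tore SDMA down */
	int hw_init_calls;  /* how often the driver brought SDMA up */
};

/* Stand-in for sdma_v4_0_hw_fini(): the driver stops the engine. */
static int mock_sdma_hw_fini(struct mock_device *dev)
{
	dev->sdma_running = false;
	dev->hw_fini_calls++;
	return 0;
}

/* Stand-in for sdma_v4_0_hw_init(): the driver restarts the engine. */
static int mock_sdma_hw_init(struct mock_device *dev)
{
	dev->sdma_running = true;
	dev->hw_init_calls++;
	return 0;
}

/* Same shape as the patched suspend hook: leave the engine alone in S0ix. */
static int mock_sdma_suspend(struct mock_device *dev)
{
	if (dev->in_s0ix)
		return 0;
	return mock_sdma_hw_fini(dev);
}

/* Same shape as the patched resume hook: nothing to re-init in S0ix. */
static int mock_sdma_resume(struct mock_device *dev)
{
	if (dev->in_s0ix)
		return 0;
	return mock_sdma_hw_init(dev);
}

/* One suspend/resume pair; returns 0 on success or the first error. */
static int mock_sdma_cycle(struct mock_device *dev)
{
	int r;

	assert(dev);
	r = mock_sdma_suspend(dev);
	if (r)
		return r;
	return mock_sdma_resume(dev);
}
```

Calling mock_sdma_cycle() on a device with in_s0ix set leaves both call counters at zero and sdma_running untouched, while the same call with in_s0ix clear goes through one hw_fini and one hw_init. Because the driver never touches the engine in the S0ix case, an aborted suspend has no driver-side SDMA teardown to undo, which is the behaviour the commit message describes.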