On Tue, May 16, 2023 at 10:23 PM Limonciello, Mario <mlimonci@xxxxxxx> wrote: > > > On 5/16/2023 4:57 PM, Alex Deucher wrote: > > On Tue, May 16, 2023 at 5:50 PM Limonciello, Mario <mlimonci@xxxxxxx> wrote: > >> > >> On 5/16/2023 4:39 PM, Alex Deucher wrote: > >>> On Tue, May 16, 2023 at 2:15 PM Mario Limonciello > >>> <mario.limonciello@xxxxxxx> wrote: > >>>> On GFX11 if RLC is stopped when not in GFXOFF the system will hang. > >>>> Prevent this case from ever happening. > >>>> > >>>> Tested-by: Juan Martinez <Juan.Martinez@xxxxxxx> > >>>> Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx> > >>>> --- > >>>> drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 ++++ > >>>> 1 file changed, 4 insertions(+) > >>>> > >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c > >>>> index dcbdb2641086..f1f879d9ed8d 100644 > >>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c > >>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c > >>>> @@ -1766,6 +1766,10 @@ static void gfx_v11_0_rlc_stop(struct amdgpu_device *adev) > >>>> { > >>>> u32 tmp = RREG32_SOC15(GC, 0, regRLC_CNTL); > >>>> > >>>> + if (!adev->gfx.gfx_off_state) { > >>>> + dev_err(adev->dev, "GFX is not in GFXOFF\n"); > >>>> + return; > >>>> + } > >>> This should move up before the RREG above? Also, I think it would be > >>> cleaner to just not mess with the RLC in S0i3. Can we just return > >>> early in smu_disable_dpms() for the APU case? All of the DPM features > >>> are controlled by the SMU so that function is mostly a nop of APUs > >>> anyway. > >>> > >>> Alex > >> That was what the original attempt did when we first identified this issue. > >> Unfortunately though just skipping RLC (without patches 1 and 2) means > >> that GFXOFF still either doesn't get toggled at suspend entry or isn't fully > >> > >> off at suspend entry. > >> > >> This leads to the graphics core behaving erratically upon resume. > >> > >> So if you're OK with patches 1 and 2, I'll adjust patch 3 to also skip > >> RLC for > >> APU. > > Sure. > OK, let me double check RLC skip and I'll send out a v2. > > I wonder if we need something similar as patch 2 for other APUs? > I expect patch 1 "alone" to help Renoir and Cezanne hitting a similar > circumstance. > For Rembrandt and Mendocino, they don't have IMU, so what would you poll? For older chips, mmPWR_MISC_CNTL_STATUS (smu10), mmSMUIO_GFX_MISC_CNTL (smu12). See smu_v12_0_gfx_off_control() and smu10_disable_gfx_off(). It looks like smu_v13_0_gfx_off_control() doesn't wait for gfxoff like the other functions. Alex > > > > Thinking out loud here, I wonder if we shouldn't just return early in > > the top level suspend/resume functions for S0i3. > > I think this can make sense for GFX10 and GFX11 maybe, but as it's already > bifurcated I think it's probably better to do case by case basis. >