[AMD Official Use Only - General] > -----Original Message----- > From: Lazar, Lijo <Lijo.Lazar@xxxxxxx> > Sent: Thursday, March 2, 2023 5:21 PM > To: Quan, Evan <Evan.Quan@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx> > Subject: Re: [PATCH] drm/amdgpu: disable cstate properly for driver > reloading scenario > > > > On 3/2/2023 2:43 PM, Quan, Evan wrote: > > [AMD Official Use Only - General] > > > > > > > >> -----Original Message----- > >> From: Lazar, Lijo <Lijo.Lazar@xxxxxxx> > >> Sent: Thursday, March 2, 2023 4:28 PM > >> To: Quan, Evan <Evan.Quan@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx > >> Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx> > >> Subject: Re: [PATCH] drm/amdgpu: disable cstate properly for driver > >> reloading scenario > >> > >> > >> > >> On 3/2/2023 12:28 PM, Evan Quan wrote: > >>> Gpu reset might be needed during driver reloading. To guard that(gpu > >>> reset) work, df cstate needs to be disabled properly. > >>> > >>> Signed-off-by: Evan Quan <evan.quan@xxxxxxx> > >>> Change-Id: I5c074c265c0b08a67b6934ae1ad9aa3fed245461 > >>> --- > >>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++++++ > >>> 1 file changed, 9 insertions(+) > >>> > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >>> index 51bbeaa1f311..3c854461ef32 100644 > >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >>> @@ -2816,6 +2816,15 @@ static int amdgpu_device_ip_fini_early(struct > >> amdgpu_device *adev) > >>> amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE); > >>> amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE); > >>> > >>> + /* > >>> + * Get df cstate disabled properly on driver unloading. > >>> + * Since on the succeeding driver reloading, gpu reset might > >>> + * be required. And cstate disabled is a prerequisite for > >>> + * that(gpu reset). > >>> + */ > >>> + if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW)) > >>> + dev_warn(adev->dev, "Failed to disallow df cstate"); > >>> + > >> > >> This looks more like a firmware bug. Driver sends the Unload message to > FW. > >> In that case FW should disable all features including C-state. > > Driver does not send the Unload message. We want PMFM alive and ready > for handling possible gpu reset on reloading. > > > > Actually, soc21_need_reset_on_init code itself has a bug. PSP won't get > unloaded by default on ring destruction. Even if PSP stops, it could just keep > the heartbeat value as non-zero (just that it won't increment). > > Probably, that needs to be fixed first rather than keeping PMFW alive for a > reset. As I remembered, the change(asic reset during reloading) seemed introduced to address some sriov issues. @Deucher, Alexander might share more backgrounds about this. To be honest, I'm not a fan of this(perform asic reset during reloading). Evan > > Thanks, > Lijo > > > BR > > Evan > >> > >> Thanks, > >> Lijo > >> > >>> amdgpu_amdkfd_suspend(adev, false); > >>> > >>> /* Workaroud for ASICs need to disable SMC first */