[Public]

> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
> Sent: Wednesday, November 24, 2021 9:30 PM
> To: Liang, Prike <Prike.Liang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Huang, Ray <Ray.Huang@xxxxxxx>
> Subject: Re: [PATCH v2] drm/amdgpu: reset asic after system-wide suspend aborted (v2)
>
> On 11/24/2021 6:13 PM, Prike Liang wrote:
> > Do ASIC reset at the moment Sx suspend aborted behind of amdgpu
> > suspend to keep AMDGPU in a clean reset state and that can avoid
> > re-initialize device improperly error. Currently,we just always do
> > asic reset in the amdgpu resume until sort out the PM abort case.
> >
> > v2: Remove incomplete PM abort flag and add GPU hive case check for
> > GPU reset.
> >
> > Signed-off-by: Prike Liang <Prike.Liang@xxxxxxx>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 7d4115d..3fcd90d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3983,6 +3983,14 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
> >      if (adev->in_s0ix)
> >          amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry);
> >
> > +    /*TODO: In order to not let all-always asic reset affect resume latency
> > +     * need sort out the case which really need asic reset in the resume process.
> > +     * As to the known issue on the system suspend abort behind the AMDGPU suspend,
> > +     * may can sort this case by checking struct suspend_stats which need exported
> > +     * firstly.
> > +     */
> > +    if (adev->gmc.xgmi.num_physical_nodes <= 1)
> > +        amdgpu_asic_reset(adev);
>
> Newer dGPUs depend on PMFW to do reset and that is not loaded at this
> point. For some, there is a mini FW available which could technically handle a
> reset and some of the older ones depend on PSP. Strongly suggest to check
> all such cases before doing a reset here.
>
> Or, the safest at this point could be to do the reset only for APUs.
>
> Thanks,
> Lijo
>
Thanks for the input. Sorting out the reset method for each of the many dGPUs
may take a lot of effort, so for now let's handle only APUs.

> >      /* post card */
> >      if (amdgpu_device_need_post(adev)) {
> >          r = amdgpu_device_asic_init(adev);
> >
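
To make the APU-only idea concrete, here is a rough standalone sketch of the
gating logic. It is a mock, not driver code: struct mock_adev, MOCK_IS_APU,
mock_asic_reset() and mock_resume_reset() are all hypothetical stand-ins. In
the driver I would expect the equivalent check to test the device's APU flag
(assumed here to be adev->flags & AMD_IS_APU) before calling
amdgpu_asic_reset(), but that still needs to be confirmed in the next revision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's APU flag bit. */
#define MOCK_IS_APU (1UL << 0)

/* Hypothetical, cut-down device state; not the real struct amdgpu_device. */
struct mock_adev {
	unsigned long flags;	/* device capability flags */
	int reset_count;	/* resets issued by this sketch */
};

/* Stand-in for amdgpu_asic_reset(); it only records that a reset happened. */
int mock_asic_reset(struct mock_adev *adev)
{
	adev->reset_count++;
	return 0;
}

/*
 * Reset on resume only for APUs. dGPUs are skipped because their reset
 * path may depend on firmware (PMFW, mini FW, or PSP) that is not
 * available at this point in resume.
 * Returns true if a reset was issued.
 */
bool mock_resume_reset(struct mock_adev *adev)
{
	assert(adev != NULL);

	if (!(adev->flags & MOCK_IS_APU))
		return false;

	mock_asic_reset(adev);
	return true;
}
```

With this gating, a device with the APU bit set is reset once per resume,
while a dGPU goes straight on to the existing "post card" path untouched.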