RE: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted

[Public]

> -----Original Message-----
> From: Alex Deucher <alexdeucher@xxxxxxxxx>
> Sent: Monday, November 22, 2021 11:48 PM
> To: Liang, Prike <Prike.Liang@xxxxxxx>
> Cc: Lazar, Lijo <Lijo.Lazar@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>; Christian König
> <ckoenig.leichtzumerken@xxxxxxxxx>; Huang, Ray <Ray.Huang@xxxxxxx>;
> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> aborted
>
> On Mon, Nov 22, 2021 at 9:23 AM Liang, Prike <Prike.Liang@xxxxxxx>
> wrote:
> >
> > [Public]
> >
> > > -----Original Message-----
> > > From: Alex Deucher <alexdeucher@xxxxxxxxx>
> > > Sent: Friday, November 19, 2021 12:18 AM
> > > To: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
> > > Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Christian König
> > > <ckoenig.leichtzumerken@xxxxxxxxx>; Liang, Prike
> > > <Prike.Liang@xxxxxxx>; Huang, Ray <Ray.Huang@xxxxxxx>;
> > > amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide
> > > suspend aborted
> > >
> > > > On Thu, Nov 18, 2021 at 10:01 AM Lazar, Lijo <Lijo.Lazar@xxxxxxx> wrote:
> > > >
> > > > [Public]
> > > >
> > > >
> > > > > BTW, I'm not sure if 'reset always' on resume is a good idea for GPUs
> > > > > in a hive (assuming those systems also get suspended and get hiccups).
> > > > > At this point the hive isn't reinitialized.
> > >
> > > Yeah, we should probably not reset if we are part of a hive.
> > >
> > > Alex
> > >
> > Does the GPU hive need special treatment in this aborted-suspend case
> > because a hive reset has to handle the reset dependencies between nodes
> > and reset them synchronously? If so, can we skip the hive case and only
> > do the GPU reset when adev->gmc.xgmi.num_physical_nodes == 0?
>
> Yes, exactly.  For the aborted suspend reset, we can check the value before
> doing a reset.  I think you want to check if
> adev->gmc.xgmi.num_physical_nodes <= 1.
>
> Alex
>
Thanks for the clarification. I will add this check before the GPU reset in amdgpu_device_resume().
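
For illustration only, the check discussed above could look roughly like the standalone sketch below. This is not the actual patch: the mock_* structures only mirror the adev->gmc.xgmi.num_physical_nodes field path from the thread, and the suspend_aborted flag and the helper name should_reset_after_aborted_suspend() are hypothetical stand-ins for however the driver ends up detecting the aborted suspend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Minimal mock-ups of the fields the check touches. The real amdgpu
 * structures carry far more state; only the member path
 * gmc.xgmi.num_physical_nodes is taken from the discussion.
 */
struct mock_xgmi {
	unsigned int num_physical_nodes;
};

struct mock_gmc {
	struct mock_xgmi xgmi;
};

struct mock_adev {
	struct mock_gmc gmc;
	bool suspend_aborted;	/* hypothetical: set when suspend was aborted */
};

/*
 * Decide whether to reset the ASIC on resume after an aborted suspend.
 * Devices that are part of a hive (more than one physical node) are
 * skipped, because the hive is not reinitialized at this point.
 */
bool should_reset_after_aborted_suspend(const struct mock_adev *adev)
{
	assert(adev != NULL);

	if (!adev->suspend_aborted)
		return false;

	/* 0 or 1 physical nodes means the device is not in a multi-node hive. */
	return adev->gmc.xgmi.num_physical_nodes <= 1;
}
```

Using <= 1 rather than == 0 follows Alex's suggestion, so a device that reports a single physical node is treated the same as one that reports none.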
> >
> > > >
> > > > Thanks,
> > > > Lijo



