RE: [PATCH] drm/amdgpu: only save and restore GPU device config at GPU error

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[Public]

Hi, Lijo

> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
> Sent: Thursday, July 27, 2023 6:49 PM
> To: Liang, Prike <Prike.Liang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
> Subject: Re: [PATCH] drm/amdgpu: only save and restore GPU device config
> at GPU error
>
>
>
> On 7/27/2023 3:20 PM, Prike Liang wrote:
> > There's need a check on the GPU error state before save and restore
> > GPU device config space.
> >
> > Signed-off-by: Prike Liang <Prike.Liang@xxxxxxx>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 358dcc1070c5..5ef3c5c49bee 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3946,7 +3946,8 @@ int amdgpu_device_init(struct amdgpu_device
> *adev,
> >             dev_err(adev->dev, "amdgpu_pmu_init failed\n");
> >
> >     /* Have stored pci confspace at hand for restore in sudden PCI error
> */
> > -   if (amdgpu_device_cache_pci_state(adev->pdev))
> > +   if (adev->pdev->error_state != pci_channel_io_normal &&
> > +                   amdgpu_device_cache_pci_state(adev->pdev))
>
> We need the clean state to be cached, not the state when there is an error.
> This state is later used to restore later, say when a mode-2 reset happens.
>
> Thanks,
> Lijo
>
But the original code in the amdgpu_device_init () is trying to restore the GPU config space immediately after the GPU config space saved, so
that looks this GPU save state restored unconditionally and that's not only for the GPU error case? Meanwhile, as to the mode-2 or some other reset case the reset
function will do the save and restore GPU state separately and not use the GPU state saved in amdgpu_device_init() process. In the amdgpu_device_init(), how
about restore the GPU state only for the GPU error case?

> >             pci_restore_state(pdev);
> >
> >     /* if we have > 1 VGA cards, then disable the amdgpu VGA resources
> > */




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux