On Fri, Jan 13, 2023 at 4:44 AM <vitaly.prosyak@xxxxxxx> wrote:
>
> From: Vitaly Prosyak <vitaly.prosyak@xxxxxxx>
>
> Revert the following change: move drm_dev_unplug call after
> amdgpu_driver_unload_kms in amdgpu_pci_remove.
> The reason is following: amdgpu_pci_remove calls drm_dev_unregister
> and it should be called first to ensure userspace can't access the
> device instance anymore. If we call drm_dev_unplug after
> amdgpu_driver_unload_kms then we observe IGT PCI software unplug
> test failure (kernel hung) for all ASIC's. This is how this
> regression was found.

Maybe just revert the entire patch rather than a partial revert? That
said, reverting the change will break driver unload because we need to
send certain messages to the PSP to clean things up. I'm not sure what
the best way around that is.

Alex

>
> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 06aba201d4db..8a1047224000 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2226,6 +2226,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>  	struct drm_device *dev = pci_get_drvdata(pdev);
>  	struct amdgpu_device *adev = drm_to_adev(dev);
>
> +	drm_dev_unplug(dev);
> +
>  	if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) {
>  		pm_runtime_get_sync(dev->dev);
>  		pm_runtime_forbid(dev->dev);
> @@ -2265,7 +2267,7 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>
>  	amdgpu_driver_unload_kms(dev);
>
> -	drm_dev_unplug(dev);
> +
>
>  	/*
>  	 * Flush any in flight DMA operations from device.
> --
> 2.25.1
>