Sorry, the code base was incorrect. I just sent out a v2.

Regards,
Lyndon

> -----Original Message-----
> From: Chen, Guchun <Guchun.Chen@xxxxxxx>
> Sent: Thursday, March 2, 2023 1:28 PM
> To: Li, Lyndon <Lyndon.Li@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Koenig, Christian <Christian.Koenig@xxxxxxx>; Chai, Thomas
> <YiPeng.Chai@xxxxxxx>; Xu, Feifei <Feifei.Xu@xxxxxxx>; Li, Lyndon
> <Lyndon.Li@xxxxxxx>
> Subject: RE: [PATCH] drm/amdgpu: Fix call trace warning and hang when
> removing amdgpu device
>
> -     adev->in_suspend || adev->ddev.unplugged)
>
> I don't think the code base is correct. Please double check it.
>
> Regards,
> Guchun
>
> -----Original Message-----
> From: lyndonli <Lyndon.Li@xxxxxxx>
> Sent: Thursday, March 2, 2023 12:57 PM
> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Koenig, Christian <Christian.Koenig@xxxxxxx>; Chen, Guchun
> <Guchun.Chen@xxxxxxx>; Chai, Thomas <YiPeng.Chai@xxxxxxx>; Xu,
> Feifei <Feifei.Xu@xxxxxxx>; Li, Lyndon <Lyndon.Li@xxxxxxx>
> Subject: [PATCH] drm/amdgpu: Fix call trace warning and hang when
> removing amdgpu device
>
> On GPUs with RAS enabled, below call trace and hang are observed when
> shutting down device.
>
> v2: use DRM device unplugged flag instead of shutdown flag as the check to
> prevent memory wipe in shutdown stage.
>
> [ +0.000000] RIP: 0010:amdgpu_vram_mgr_fini+0x18d/0x1c0 [amdgpu]
> [ +0.000001] PKRU: 55555554
> [ +0.000001] Call Trace:
> [ +0.000001] <TASK>
> [ +0.000002] amdgpu_ttm_fini+0x140/0x1c0 [amdgpu]
> [ +0.000183] amdgpu_bo_fini+0x27/0xa0 [amdgpu]
> [ +0.000184] gmc_v11_0_sw_fini+0x2b/0x40 [amdgpu]
> [ +0.000163] amdgpu_device_fini_sw+0xb6/0x510 [amdgpu]
> [ +0.000152] amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
> [ +0.000090] drm_dev_release+0x28/0x50 [drm]
> [ +0.000016] devm_drm_dev_init_release+0x38/0x60 [drm]
> [ +0.000011] devm_action_release+0x15/0x20
> [ +0.000003] release_nodes+0x40/0xc0
> [ +0.000001] devres_release_all+0x9e/0xe0
> [ +0.000001] device_unbind_cleanup+0x12/0x80
> [ +0.000003] device_release_driver_internal+0xff/0x160
> [ +0.000001] driver_detach+0x4a/0x90
> [ +0.000001] bus_remove_driver+0x6c/0xf0
> [ +0.000001] driver_unregister+0x31/0x50
> [ +0.000001] pci_unregister_driver+0x40/0x90
> [ +0.000003] amdgpu_exit+0x15/0x120 [amdgpu]
>
> Signed-off-by: lyndonli <Lyndon.Li@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index d50f8bfb9be9..5554ff22d724 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1314,7 +1314,7 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
>
>       if (!bo->resource || bo->resource->mem_type != TTM_PL_VRAM ||
>           !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE) ||
> -         adev->in_suspend || adev->ddev.unplugged)
> +         adev->in_suspend || drm_dev_is_unplugged(adev_to_drm(adev)))
>               return;
>
>       if (WARN_ON_ONCE(!dma_resv_trylock(bo->base.resv)))
> --
> 2.34.1