[AMD Official Use Only - General] Thanks, will update it. Regards, Lyndon > -----Original Message----- > From: Christian König <ckoenig.leichtzumerken@xxxxxxxxx> > Sent: Tuesday, March 7, 2023 2:22 PM > To: Chen, Guchun <Guchun.Chen@xxxxxxx>; Li, Lyndon > <Lyndon.Li@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Xu, Feifei <Feifei.Xu@xxxxxxx>; Ma, Jun <Jun.Ma2@xxxxxxx>; > Prosyak, Vitaly <Vitaly.Prosyak@xxxxxxx>; Deucher, Alexander > <Alexander.Deucher@xxxxxxx>; Koenig, Christian > <Christian.Koenig@xxxxxxx> > Subject: Re: [PATCH] drm/amdgpu: Fix the warning info when removing > amdgpu device > > The commit message reads a bit bumpy. Generally best practice are: > > Short (72 chars or less) summary > > More detailed explanatory text. Wrap it to 72 characters. The blank line > separating the summary from the body is critical (unless you omit the body > entirely). > > Write your commit message in the imperative: "Fix bug" and not "Fixed bug" > or "Fixes bug." This convention matches up with commit messages > generated by commands like git merge and git revert. > > Further paragraphs come after blank lines. > > - Bullet points are okay, too. > - Typically a hyphen or asterisk is used for the bullet, followed by a > single space. Use a hanging indent. > > Apart from that the patch is Acked-by: Christian König > <christian.koenig@xxxxxxx> > > Regards, > Christian. > > Am 07.03.23 um 03:23 schrieb Chen, Guchun: > > Reviewed-by: Guchun Chen <guchun.chen@xxxxxxx> > > > > Regards, > > Guchun > > > > -----Original Message----- > > From: lyndonli <Lyndon.Li@xxxxxxx> > > Sent: Tuesday, March 7, 2023 10:12 AM > > To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx > > Cc: Prosyak, Vitaly <Vitaly.Prosyak@xxxxxxx>; Koenig, Christian > > <Christian.Koenig@xxxxxxx>; Deucher, Alexander > > <Alexander.Deucher@xxxxxxx>; Chen, Guchun > <Guchun.Chen@xxxxxxx>; Xu, > > Feifei <Feifei.Xu@xxxxxxx>; Ma, Jun <Jun.Ma2@xxxxxxx>; Li, Lyndon > > <Lyndon.Li@xxxxxxx> > > Subject: [PATCH] drm/amdgpu: Fix the warning info when removing > amdgpu > > device > > > > Actually, the drm_dev_enter in psp_cmd_submit_buf does not protect > anything. > > And it is not used to prevent concurrent access. > > If DRM device is unplugged, it will always check the condition in WARN_ON. > > We'd better not keep adding commands to the list. > > Simply moving the drm_dev_enter/drm_dev_exit higher level will not > solve the issue. > > Because psp_cmd_submit_buf is called in many places, such as > psp_hw_init-->psp_load_fw, psp_suspend-->psp_xgmi_terminate, > amdgpu_device_gpu_recover-->amdgpu_ras_suspend. > > So drop drm_dev_enter/drm_dev_exit in psp_cmd_submit_buf. > > > > When removing amdgpu, the calling order as follows: > > amdgpu_pci_remove > > drm_dev_unplug > > amdgpu_driver_unload_kms > > amdgpu_device_fini_hw > > amdgpu_device_ip_fini_early > > psp_hw_fini > > psp_ras_terminate > > psp_ta_unloadye > > > psp_cmd_submit_buf > > > > [ 4507.740388] Call Trace: > > [ 4507.740389] <TASK> > > [ 4507.740391] psp_ta_unload+0x44/0x70 [amdgpu] [ 4507.740485] > > psp_ras_terminate+0x4d/0x70 [amdgpu] [ 4507.740575] > > psp_hw_fini+0x28/0xa0 [amdgpu] [ 4507.740662] > > amdgpu_device_fini_hw+0x328/0x442 [amdgpu] [ 4507.740791] > > amdgpu_driver_unload_kms+0x51/0x60 [amdgpu] [ 4507.740875] > > amdgpu_pci_remove+0x5a/0x140 [amdgpu] [ 4507.740962] ? > > _raw_spin_unlock_irqrestore+0x27/0x43 > > [ 4507.740965] ? __pm_runtime_resume+0x60/0x90 [ 4507.740968] > > pci_device_remove+0x39/0xb0 [ 4507.740971] device_remove+0x46/0x70 > [ > > 4507.740972] device_release_driver_internal+0xd1/0x160 > > [ 4507.740974] driver_detach+0x4a/0x90 [ 4507.740975] > > bus_remove_driver+0x6c/0xf0 [ 4507.740976] > > driver_unregister+0x31/0x50 [ 4507.740977] > > pci_unregister_driver+0x40/0x90 [ 4507.740978] amdgpu_exit+0x15/0x120 > > [amdgpu] > > > > Signed-off-by: lyndonli <Lyndon.Li@xxxxxxx> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 17 +---------------- > > 1 file changed, 1 insertion(+), 16 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > > index 4c617faaa7c9..02f948adae72 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > > @@ -603,27 +603,14 @@ psp_cmd_submit_buf(struct psp_context *psp, > > struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr) > { > > int ret; > > - int index, idx; > > + int index; > > int timeout = 20000; > > bool ras_intr = false; > > bool skip_unsupport = false; > > - bool dev_entered; > > > > if (psp->adev->no_hw_access) > > return 0; > > > > - dev_entered = drm_dev_enter(adev_to_drm(psp->adev), &idx); > > - /* > > - * We allow sending PSP messages LOAD_ASD and UNLOAD_TA > without acquiring > > - * a lock in drm_dev_enter during driver unload because we must call > > - * drm_dev_unplug as the beginning of unload driver sequence . It is > very > > - * crucial that userspace can't access device instances anymore. > > - */ > > - if (!dev_entered) > > - WARN_ON(psp->cmd_buf_mem->cmd_id != > GFX_CMD_ID_LOAD_ASD && > > - psp->cmd_buf_mem->cmd_id != > GFX_CMD_ID_UNLOAD_TA && > > - psp->cmd_buf_mem->cmd_id != > GFX_CMD_ID_INVOKE_CMD); > > - > > memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE); > > > > memcpy(psp->cmd_buf_mem, cmd, sizeof(struct > psp_gfx_cmd_resp)); @@ -687,8 +674,6 @@ psp_cmd_submit_buf(struct > psp_context *psp, > > } > > > > exit: > > - if (dev_entered) > > - drm_dev_exit(idx); > > return ret; > > } > > > > -- > > 2.34.1 > >