RE: [PATCH] drm/amdgpu: Fix the warning info when removing amdgpu device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[AMD Official Use Only - General]

Thanks, will update it.

Regards,
Lyndon

> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
> Sent: Tuesday, March 7, 2023 2:22 PM
> To: Chen, Guchun <Guchun.Chen@xxxxxxx>; Li, Lyndon
> <Lyndon.Li@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Xu, Feifei <Feifei.Xu@xxxxxxx>; Ma, Jun <Jun.Ma2@xxxxxxx>;
> Prosyak, Vitaly <Vitaly.Prosyak@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>; Koenig, Christian
> <Christian.Koenig@xxxxxxx>
> Subject: Re: [PATCH] drm/amdgpu: Fix the warning info when removing
> amdgpu device
> 
> The commit message reads a bit bumpy. Generally best practice are:
> 
> Short (72 chars or less) summary
> 
> More detailed explanatory text. Wrap it to 72 characters. The blank line
> separating the summary from the body is critical (unless you omit the body
> entirely).
> 
> Write your commit message in the imperative: "Fix bug" and not "Fixed bug"
> or "Fixes bug." This convention matches up with commit messages
> generated by commands like git merge and git revert.
> 
> Further paragraphs come after blank lines.
> 
> - Bullet points are okay, too.
> - Typically a hyphen or asterisk is used for the bullet, followed by a
>    single space. Use a hanging indent.
> 
> Apart from that the patch is Acked-by: Christian König
> <christian.koenig@xxxxxxx>
> 
> Regards,
> Christian.
> 
> Am 07.03.23 um 03:23 schrieb Chen, Guchun:
> > Reviewed-by: Guchun Chen <guchun.chen@xxxxxxx>
> >
> > Regards,
> > Guchun
> >
> > -----Original Message-----
> > From: lyndonli <Lyndon.Li@xxxxxxx>
> > Sent: Tuesday, March 7, 2023 10:12 AM
> > To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> > Cc: Prosyak, Vitaly <Vitaly.Prosyak@xxxxxxx>; Koenig, Christian
> > <Christian.Koenig@xxxxxxx>; Deucher, Alexander
> > <Alexander.Deucher@xxxxxxx>; Chen, Guchun
> <Guchun.Chen@xxxxxxx>; Xu,
> > Feifei <Feifei.Xu@xxxxxxx>; Ma, Jun <Jun.Ma2@xxxxxxx>; Li, Lyndon
> > <Lyndon.Li@xxxxxxx>
> > Subject: [PATCH] drm/amdgpu: Fix the warning info when removing
> amdgpu
> > device
> >
> > Actually, the drm_dev_enter in psp_cmd_submit_buf does not protect
> anything.
> > And it is not used to prevent concurrent access.
> > If DRM device is unplugged, it will always check the condition in WARN_ON.
> > We'd better not keep adding commands to the list.
> > Simply moving the drm_dev_enter/drm_dev_exit higher level will not
> solve the issue.
> > Because psp_cmd_submit_buf is called in many places, such as
> psp_hw_init-->psp_load_fw, psp_suspend-->psp_xgmi_terminate,
> amdgpu_device_gpu_recover-->amdgpu_ras_suspend.
> > So drop drm_dev_enter/drm_dev_exit in psp_cmd_submit_buf.
> >
> > When removing amdgpu, the calling order as follows:
> > amdgpu_pci_remove
> > 	drm_dev_unplug
> > 	amdgpu_driver_unload_kms
> > 		amdgpu_device_fini_hw
> > 			amdgpu_device_ip_fini_early
> > 				psp_hw_fini
> > 					psp_ras_terminate
> > 						psp_ta_unloadye
> >
> 	psp_cmd_submit_buf
> >
> > [ 4507.740388] Call Trace:
> > [ 4507.740389]  <TASK>
> > [ 4507.740391]  psp_ta_unload+0x44/0x70 [amdgpu] [ 4507.740485]
> > psp_ras_terminate+0x4d/0x70 [amdgpu] [ 4507.740575]
> > psp_hw_fini+0x28/0xa0 [amdgpu] [ 4507.740662]
> > amdgpu_device_fini_hw+0x328/0x442 [amdgpu] [ 4507.740791]
> > amdgpu_driver_unload_kms+0x51/0x60 [amdgpu] [ 4507.740875]
> > amdgpu_pci_remove+0x5a/0x140 [amdgpu] [ 4507.740962]  ?
> > _raw_spin_unlock_irqrestore+0x27/0x43
> > [ 4507.740965]  ? __pm_runtime_resume+0x60/0x90 [ 4507.740968]
> > pci_device_remove+0x39/0xb0 [ 4507.740971]  device_remove+0x46/0x70
> [
> > 4507.740972]  device_release_driver_internal+0xd1/0x160
> > [ 4507.740974]  driver_detach+0x4a/0x90 [ 4507.740975]
> > bus_remove_driver+0x6c/0xf0 [ 4507.740976]
> > driver_unregister+0x31/0x50 [ 4507.740977]
> > pci_unregister_driver+0x40/0x90 [ 4507.740978]  amdgpu_exit+0x15/0x120
> > [amdgpu]
> >
> > Signed-off-by: lyndonli <Lyndon.Li@xxxxxxx>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 17 +----------------
> >   1 file changed, 1 insertion(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > index 4c617faaa7c9..02f948adae72 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > @@ -603,27 +603,14 @@ psp_cmd_submit_buf(struct psp_context *psp,
> >   		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)
> {
> >   	int ret;
> > -	int index, idx;
> > +	int index;
> >   	int timeout = 20000;
> >   	bool ras_intr = false;
> >   	bool skip_unsupport = false;
> > -	bool dev_entered;
> >
> >   	if (psp->adev->no_hw_access)
> >   		return 0;
> >
> > -	dev_entered = drm_dev_enter(adev_to_drm(psp->adev), &idx);
> > -	/*
> > -	 * We allow sending PSP messages LOAD_ASD and UNLOAD_TA
> without acquiring
> > -	 * a lock in drm_dev_enter during driver unload because we must call
> > -	 * drm_dev_unplug as the beginning  of unload driver sequence . It is
> very
> > -	 * crucial that userspace can't access device instances anymore.
> > -	 */
> > -	if (!dev_entered)
> > -		WARN_ON(psp->cmd_buf_mem->cmd_id !=
> GFX_CMD_ID_LOAD_ASD &&
> > -			psp->cmd_buf_mem->cmd_id !=
> GFX_CMD_ID_UNLOAD_TA &&
> > -			psp->cmd_buf_mem->cmd_id !=
> GFX_CMD_ID_INVOKE_CMD);
> > -
> >   	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
> >
> >   	memcpy(psp->cmd_buf_mem, cmd, sizeof(struct
> psp_gfx_cmd_resp)); @@ -687,8 +674,6 @@ psp_cmd_submit_buf(struct
> psp_context *psp,
> >   	}
> >
> >   exit:
> > -	if (dev_entered)
> > -		drm_dev_exit(idx);
> >   	return ret;
> >   }
> >
> > --
> > 2.34.1
> >




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux