Re: [PATCH] drm/amdgpu: Fix the warning info when removing amdgpu device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The commit message reads a bit bumpy. Generally best practice are:

Short (72 chars or less) summary

More detailed explanatory text. Wrap it to 72 characters. The blank
line separating the summary from the body is critical (unless you omit
the body entirely).

Write your commit message in the imperative: "Fix bug" and not "Fixed
bug" or "Fixes bug." This convention matches up with commit messages
generated by commands like git merge and git revert.

Further paragraphs come after blank lines.

- Bullet points are okay, too.
- Typically a hyphen or asterisk is used for the bullet, followed by a
  single space. Use a hanging indent.

Apart from that the patch is Acked-by: Christian König <christian.koenig@xxxxxxx>

Regards,
Christian.

Am 07.03.23 um 03:23 schrieb Chen, Guchun:
Reviewed-by: Guchun Chen <guchun.chen@xxxxxxx>

Regards,
Guchun

-----Original Message-----
From: lyndonli <Lyndon.Li@xxxxxxx>
Sent: Tuesday, March 7, 2023 10:12 AM
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Prosyak, Vitaly <Vitaly.Prosyak@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Chen, Guchun <Guchun.Chen@xxxxxxx>; Xu, Feifei <Feifei.Xu@xxxxxxx>; Ma, Jun <Jun.Ma2@xxxxxxx>; Li, Lyndon <Lyndon.Li@xxxxxxx>
Subject: [PATCH] drm/amdgpu: Fix the warning info when removing amdgpu device

Actually, the drm_dev_enter in psp_cmd_submit_buf does not protect anything.
And it is not used to prevent concurrent access.
If DRM device is unplugged, it will always check the condition in WARN_ON.
We'd better not keep adding commands to the list.
Simply moving the drm_dev_enter/drm_dev_exit higher level will not solve the issue.
Because psp_cmd_submit_buf is called in many places, such as psp_hw_init-->psp_load_fw, psp_suspend-->psp_xgmi_terminate, amdgpu_device_gpu_recover-->amdgpu_ras_suspend.
So drop drm_dev_enter/drm_dev_exit in psp_cmd_submit_buf.

When removing amdgpu, the calling order as follows:
amdgpu_pci_remove
	drm_dev_unplug
	amdgpu_driver_unload_kms
		amdgpu_device_fini_hw
			amdgpu_device_ip_fini_early
				psp_hw_fini
					psp_ras_terminate
						psp_ta_unloadye
							psp_cmd_submit_buf

[ 4507.740388] Call Trace:
[ 4507.740389]  <TASK>
[ 4507.740391]  psp_ta_unload+0x44/0x70 [amdgpu] [ 4507.740485]  psp_ras_terminate+0x4d/0x70 [amdgpu] [ 4507.740575]  psp_hw_fini+0x28/0xa0 [amdgpu] [ 4507.740662]  amdgpu_device_fini_hw+0x328/0x442 [amdgpu] [ 4507.740791]  amdgpu_driver_unload_kms+0x51/0x60 [amdgpu] [ 4507.740875]  amdgpu_pci_remove+0x5a/0x140 [amdgpu] [ 4507.740962]  ? _raw_spin_unlock_irqrestore+0x27/0x43
[ 4507.740965]  ? __pm_runtime_resume+0x60/0x90 [ 4507.740968]  pci_device_remove+0x39/0xb0 [ 4507.740971]  device_remove+0x46/0x70 [ 4507.740972]  device_release_driver_internal+0xd1/0x160
[ 4507.740974]  driver_detach+0x4a/0x90
[ 4507.740975]  bus_remove_driver+0x6c/0xf0 [ 4507.740976]  driver_unregister+0x31/0x50 [ 4507.740977]  pci_unregister_driver+0x40/0x90 [ 4507.740978]  amdgpu_exit+0x15/0x120 [amdgpu]

Signed-off-by: lyndonli <Lyndon.Li@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 17 +----------------
  1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 4c617faaa7c9..02f948adae72 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -603,27 +603,14 @@ psp_cmd_submit_buf(struct psp_context *psp,
  		   struct psp_gfx_cmd_resp *cmd, uint64_t fence_mc_addr)  {
  	int ret;
-	int index, idx;
+	int index;
  	int timeout = 20000;
  	bool ras_intr = false;
  	bool skip_unsupport = false;
-	bool dev_entered;
if (psp->adev->no_hw_access)
  		return 0;
- dev_entered = drm_dev_enter(adev_to_drm(psp->adev), &idx);
-	/*
-	 * We allow sending PSP messages LOAD_ASD and UNLOAD_TA without acquiring
-	 * a lock in drm_dev_enter during driver unload because we must call
-	 * drm_dev_unplug as the beginning  of unload driver sequence . It is very
-	 * crucial that userspace can't access device instances anymore.
-	 */
-	if (!dev_entered)
-		WARN_ON(psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_LOAD_ASD &&
-			psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_UNLOAD_TA &&
-			psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_INVOKE_CMD);
-
  	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
memcpy(psp->cmd_buf_mem, cmd, sizeof(struct psp_gfx_cmd_resp)); @@ -687,8 +674,6 @@ psp_cmd_submit_buf(struct psp_context *psp,
  	}
exit:
-	if (dev_entered)
-		drm_dev_exit(idx);
  	return ret;
  }
--
2.34.1





[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux