[v5 3/5] drm/amdgpu: fix invalid memory access in amdgpu_xcp_cfg_sysfs_fini()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



When AMD gpu firmware files are missing, loading the amdgpu driver will
cause following invalid memory access:

[   89.735573] amdgpu 0000:0a:00.0: amdgpu: Fetched VBIOS from platform
[   89.735583] amdgpu: ATOM BIOS: 113-M3080202-101
[   89.735676] amdgpu 0000:0a:00.0: Direct firmware load for amdgpu/psp_13_0_14_sos.bin failed with error -2
[   89.735684] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <psp> failed -19
[   89.746649] amdgpu 0000:0a:00.0: Direct firmware load for amdgpu/smu_13_0_14.bin failed with error -2
[   89.746657] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <smu> failed -19
[   89.757203] amdgpu 0000:0a:00.0: Direct firmware load for amdgpu/gc_9_4_4_rlc.bin failed with error -2
[   89.757209] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <gfx_v9_4_3> failed -19
[   89.768331] amdgpu 0000:0a:00.0: Direct firmware load for amdgpu/sdma_4_4_5.bin failed with error -2
[   89.768341] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <sdma_v4_4_2> failed -19
[   89.779385] amdgpu 0000:0a:00.0: Direct firmware load for amdgpu/vcn_4_0_3.bin failed with error -2
[   89.779389] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <vcn_v4_0_3> failed -19
[   89.790320] amdgpu 0000:0a:00.0: amdgpu: Fatal error during GPU init
[   89.797498] amdgpu 0000:0a:00.0: amdgpu: amdgpu: finishing device.
[   89.797519] BUG: kernel NULL pointer dereference, address: 0000000000000010
[   89.805377] #PF: supervisor read access in kernel mode
[   89.811639] #PF: error_code(0x0000) - not-present page
[   89.817825] PGD 0
[   89.820455] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[   89.826296] CPU: 0 UID: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.12.0+ #3
[   89.834265] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 3.0.ES.AL.P.091.02 08/04/2024
[   89.845430] Workqueue: events work_for_cpu_fn
[   89.850649] RIP: 0010:amdgpu_device_fini_hw+0x1f2/0x570 [amdgpu]
[   89.858316] Code: 65 49 63 f5 48 83 fe 11 0f 83 0c 03 00 00 4c 89 ff 49 83 ec 20 4d 8d 7f e0 48 85 ff 74 48 41 83 ed 01 49 8b 84 24 18 6f 04 00 <48> 8b 40 10 48 8b 40 28 48 85 c0 74 c4 ff d0 0f 1f 00 41 89 c1 85
[   89.879966] RSP: 0018:ff23a0f9401dbd18 EFLAGS: 00010206
[   89.886123] RAX: 0000000000000000 RBX: ff13ecb721800000 RCX: 0000000000000000
[   89.894411] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ff13ecb721847028
[   89.902705] RBP: ff23a0f9401dbd68 R08: 0000000000000000 R09: 0000000000000000
[   89.910983] R10: 0000000000000000 R11: 0000000000000000 R12: ff13ecb721800120
[   89.919249] R13: 0000000000000008 R14: ff13ecb721800010 R15: ff13ecb721847008
[   89.927505] FS:  0000000000000000(0000) GS:ff13edb1fd000000(0000) knlGS:0000000000000000
[   89.936828] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.943513] CR2: 0000000000000010 CR3: 000001a8de63e001 CR4: 0000000000f73ef0
[   89.951761] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   89.960010] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[   89.968252] PKRU: 55555554
[   89.971516] Call Trace:
[   89.974487]  <TASK>
[   89.977067]  ? show_regs+0x6c/0x80
[   89.981115]  ? __die+0x24/0x80
[   89.984761]  ? page_fault_oops+0x175/0x5c0
[   89.989580]  ? vprintk_emit+0x127/0x400
[   89.994106]  ? do_user_addr_fault+0x4b2/0x870
[   89.999222]  ? exc_page_fault+0x82/0x1b0
[   90.003858]  ? asm_exc_page_fault+0x27/0x30
[   90.008804]  ? amdgpu_device_fini_hw+0x1f2/0x570 [amdgpu]
[   90.015539]  ? amdgpu_device_fini_hw+0x16d/0x570 [amdgpu]
[   90.022274]  ? blocking_notifier_chain_unregister+0x38/0x70
[   90.028773]  amdgpu_driver_unload_kms+0x4b/0x70 [amdgpu]
[   90.035419]  amdgpu_driver_load_kms+0x91/0xd0 [amdgpu]
[   90.041865]  amdgpu_pci_probe+0x1d1/0x650 [amdgpu]
[   90.047920]  local_pci_probe+0x44/0xb0
[   90.052388]  work_for_cpu_fn+0x17/0x30
[   90.056847]  process_one_work+0x178/0x3d0
[   90.061607]  worker_thread+0x2de/0x410
[   90.066066]  ? __pfx_worker_thread+0x10/0x10
[   90.071114]  kthread+0xe1/0x110
[   90.074891]  ? __pfx_kthread+0x10/0x10
[   90.079353]  ret_from_fork+0x44/0x70
[   90.083627]  ? __pfx_kthread+0x10/0x10
[   90.088085]  ret_from_fork_asm+0x1a/0x30
[   90.092742]  </TASK>
[   90.252277] ---[ end trace 0000000000000000 ]---

Signed-off-by: Jiang Liu <gerry@xxxxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c
index 272954f6e476..d430fab12355 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c
@@ -752,7 +752,7 @@ void amdgpu_xcp_cfg_sysfs_fini(struct amdgpu_device *adev)
 	struct amdgpu_xcp_cfg *xcp_cfg;
 	int i;
 
-	if (!adev->xcp_mgr)
+	if (!adev->xcp_mgr || !adev->xcp_mgr->xcp_cfg)
 		return;
 
 	xcp_cfg =  adev->xcp_mgr->xcp_cfg;
-- 
2.43.5




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux