[Public]
My BAD to misunderstand this.
There are both spell typos in patch subject and body, s/iff/if.
The patch is: Reviewed-by: Guchun Chen <guchun.chen@xxxxxxx>
Please wait for the ack from Andrey and Christian before pushing this.
Regards,
Guchun
-----Original Message-----
From: Shi, Leslie <Yuliang.Shi@xxxxxxx>
Sent: Thursday, December 16, 2021 3:00 PM
To: Chen, Guchun <Guchun.Chen@xxxxxxx>; Grodzovsky, Andrey
<Andrey.Grodzovsky@xxxxxxx>; Koenig, Christian
<Christian.Koenig@xxxxxxx>; Pan, Xinhui <Xinhui.Pan@xxxxxxx>;
Deucher, Alexander <Alexander.Deucher@xxxxxxx>;
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio()
iff device is unplugged to prevent crash in GPU initialization failure
[Public]
Hi Guchun,
As Andrey says, "we should not call amdgpu_device_unmap_mmio unless
device is unplugged", I think we should call
amdgpu_device_unmap_mmio() only if device is unplugged
(drm_dev_enter() return false) .
+if (!drm_dev_enter(adev_to_drm(adev), &idx))
+ amdgpu_device_unmap_mmio(adev);
+ else
# drm_dev_exit(idx);
Regards,
Leslie
-----Original Message-----
From: Chen, Guchun <Guchun.Chen@xxxxxxx>
Sent: Thursday, December 16, 2021 2:46 PM
To: Shi, Leslie <Yuliang.Shi@xxxxxxx>; Grodzovsky, Andrey
<Andrey.Grodzovsky@xxxxxxx>; Koenig, Christian
<Christian.Koenig@xxxxxxx>; Pan, Xinhui <Xinhui.Pan@xxxxxxx>;
Deucher, Alexander <Alexander.Deucher@xxxxxxx>;
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio()
iff device is unplugged to prevent crash in GPU initialization failure
[Public]
Hi Leslie,
I think we need to modify it like:
+if (drm_dev_enter(adev_to_drm(adev), &idx)) {
+ amdgpu_device_unmap_mmio(adev);
+ drm_dev_exit(idx);
+}
Also you need to credit Andrey a 'suggested-by' in your patch.
Regards,
Guchun
-----Original Message-----
From: Shi, Leslie <Yuliang.Shi@xxxxxxx>
Sent: Thursday, December 16, 2021 2:14 PM
To: Grodzovsky, Andrey <Andrey.Grodzovsky@xxxxxxx>; Koenig, Christian
<Christian.Koenig@xxxxxxx>; Pan, Xinhui <Xinhui.Pan@xxxxxxx>;
Deucher, Alexander <Alexander.Deucher@xxxxxxx>;
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Chen, Guchun <Guchun.Chen@xxxxxxx>; Shi, Leslie
<Yuliang.Shi@xxxxxxx>
Subject: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff
device is unplugged to prevent crash in GPU initialization failure
[Why]
In amdgpu_driver_load_kms, when amdgpu_device_init returns error
during driver modprobe, it will start the error handle path
immediately and call into amdgpu_device_unmap_mmio as well to release
mapped VRAM. However, in the following release callback, driver
stills visits the unmapped memory like vcn.inst[i].fw_shared_cpu_addr
in vcn_v3_0_sw_fini. So a kernel crash occurs.
[How]
call amdgpu_device_unmap_mmio() iff device is unplugged to prevent
invalid memory address in
vcn_v3_0_sw_fini() when GPU initialization failure.
Signed-off-by: Leslie Shi <Yuliang.Shi@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index fb03d75880ec..d3656e7b60c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3845,6 +3845,8 @@ static void amdgpu_device_unmap_mmio(struct
amdgpu_device *adev)
*/
void amdgpu_device_fini_hw(struct amdgpu_device *adev) {
+ int idx;
+
dev_info(adev->dev, "amdgpu: finishing device.\n");
flush_delayed_work(&adev->delayed_init_work);
if (adev->mman.initialized) {
@@ -3888,7 +3890,11 @@ void amdgpu_device_fini_hw(struct
amdgpu_device *adev)
amdgpu_gart_dummy_page_fini(adev);
- amdgpu_device_unmap_mmio(adev);
+ if (!drm_dev_enter(adev_to_drm(adev), &idx))
+ amdgpu_device_unmap_mmio(adev);
+ else
+ drm_dev_exit(idx);
+
}
void amdgpu_device_fini_sw(struct amdgpu_device *adev)
--
2.25.1