Dear Paul,
Here is the updated patch:
[why]
A warning that lru_list is not empty is triggered in sw fini during
repeated device bind/unbind.
There should be an amdgpu_fence_wait_empty() before the flush_delayed_work()
call, as Christian suggested.
[how]
Move the flush_delayed_work() call for the ttm bo delayed delete wq to
after amdgpu_fence_driver_hw_fini().
Signed-off-by: Yiqing Yao <yiqing.yao@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 14c5ccf81e80..92e5ed3ed345 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4003,10 +4003,6 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
{
dev_info(adev->dev, "amdgpu: finishing device.\n");
flush_delayed_work(&adev->delayed_init_work);
- if (adev->mman.initialized) {
- flush_delayed_work(&adev->mman.bdev.wq);
- ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
- }
adev->shutdown = true;
/* make sure IB test finished before entering exclusive mode
@@ -4029,6 +4025,11 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
}
amdgpu_fence_driver_hw_fini(adev);
+ if (adev->mman.initialized) {
+ flush_delayed_work(&adev->mman.bdev.wq);
+ ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
+ }
+
if (adev->pm_sysfs_en)
amdgpu_pm_sysfs_fini(adev);
if (adev->ucode_sysfs_en)
--
2.25.1
On 5/5/2022 3:15 PM, Paul Menzel wrote:
[how]
Do flush_delayed_work for ttm bo delayed delete wq after
fence_driver_hw_fini.
Signed-off-by: Yiqing Yao <yiqing.yao@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 14c5ccf81e80..92e5ed3ed345 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4003,10 +4003,6 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
{
dev_info(adev->dev, "amdgpu: finishing device.\n");
flush_delayed_work(&adev->delayed_init_work);
- if (adev->mman.initialized) {
- flush_delayed_work(&adev->mman.bdev.wq);
- ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
- }
From the commit message, it’s not clear that you remove this here.
This part is not dropped; it is moved later in the same function, to after
the amdgpu_fence_driver_hw_fini() call.
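To illustrate why the order matters, here is a minimal userspace sketch in C.
It is not driver code: every toy_* name and the four-entry array are made up
for illustration. It assumes, as the commit message implies, that all fences
are drained once the fence driver hw fini step has run, and that a delayed
delete flush can only release buffers whose fence has already signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_BOS 4

/* Hypothetical stand-in for a buffer object queued for delayed delete. */
struct toy_bo {
	bool fence_signaled;	/* has the GPU finished using this BO? */
	bool on_lru;		/* still linked on the toy LRU list? */
};

struct toy_dev {
	struct toy_bo bos[TOY_NR_BOS];
};

/* Put every BO on the LRU; the first nr_busy ones have pending fences. */
void toy_dev_init(struct toy_dev *dev, size_t nr_busy)
{
	assert(nr_busy <= TOY_NR_BOS);
	for (size_t i = 0; i < TOY_NR_BOS; i++) {
		dev->bos[i].fence_signaled = (i >= nr_busy);
		dev->bos[i].on_lru = true;
	}
}

/* Models the fence hw fini step: afterwards no fence is pending (assumed). */
void toy_fence_hw_fini(struct toy_dev *dev)
{
	for (size_t i = 0; i < TOY_NR_BOS; i++)
		dev->bos[i].fence_signaled = true;
}

/* Models the delayed delete flush: only idle BOs can leave the LRU. */
void toy_flush_delayed_delete(struct toy_dev *dev)
{
	for (size_t i = 0; i < TOY_NR_BOS; i++)
		if (dev->bos[i].fence_signaled)
			dev->bos[i].on_lru = false;
}

/* Number of BOs still on the LRU, i.e. what sw fini would warn about. */
size_t toy_lru_count(const struct toy_dev *dev)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NR_BOS; i++)
		if (dev->bos[i].on_lru)
			n++;
	return n;
}

/* Old order: flush first, then drain fences. Busy BOs are left behind. */
size_t toy_teardown_old_order(size_t nr_busy)
{
	struct toy_dev dev;

	toy_dev_init(&dev, nr_busy);
	toy_flush_delayed_delete(&dev);
	toy_fence_hw_fini(&dev);
	return toy_lru_count(&dev);
}

/* New order: drain fences first, then flush. Nothing is left behind. */
size_t toy_teardown_new_order(size_t nr_busy)
{
	struct toy_dev dev;

	toy_dev_init(&dev, nr_busy);
	toy_fence_hw_fini(&dev);
	toy_flush_delayed_delete(&dev);
	return toy_lru_count(&dev);
}
```

In this model, toy_teardown_old_order() leaves every busy buffer on the list,
while toy_teardown_new_order() always ends with an empty list. The real TTM
and fence code is of course more involved than this.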
Thank you for the advice,
Yiqing