Re: [PATCH v2] drm/amdgpu: Make sure ttm delayed work finished

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dear xinhui,


Am 13.04.22 um 08:46 schrieb xinhui pan:
ttm_device_delayed_workqueue would reschedule itself if there is pending
BO to be destroyed. So just one flush + cancel_sync is not enough. We
still see lru_list not empty warnging.

warning

(`scripts/checkpatch.pl --codespell` should have found this.)


Fix it by waiting all BO to be destroyed.

Please explain the waiting algorithm. Why at least one millisecond?

Do you have a reproducer for this issue?

Acked-by: Guchun Chen <guchun.chen@xxxxxxx>
Signed-off-by: xinhui pan <xinhui.pan@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++--
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 6f47726f1765..56dcf8c3a3cd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3957,11 +3957,17 @@ static void amdgpu_device_unmap_mmio(struct amdgpu_device *adev)
   */
  void amdgpu_device_fini_hw(struct amdgpu_device *adev)
  {
+	int pending = 1;

Could this be bool?

+
  	dev_info(adev->dev, "amdgpu: finishing device.\n");
  	flush_delayed_work(&adev->delayed_init_work);
-	if (adev->mman.initialized) {
+	while (adev->mman.initialized && pending) {
  		flush_delayed_work(&adev->mman.bdev.wq);
-		ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
+		pending = ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
+		if (pending) {
+			ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, true);

Does this call affect `adev->mman.initialized`? If not a do-while loop might be better suited, so pending is only checked once.

if (adev->mman.initialized) {
	do {
		flush_delayed_work(&adev->mman.bdev.wq);
		ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, true);
		msleep(((HZ / 100) < 1) ? 1 : HZ / 100);
	} while (ttm_bo_lock_delayed_workqueue(&adev->mman.bdev));
}

The logic is slightly different, as `ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, true);` and the sleep are run at least once, so the suggestion might be unsuitable.

+			msleep(((HZ / 100) < 1) ? 1 : HZ / 100);

Maybe add a comment for that formula? (No idea, if common knowledge, but why is a delay needed, and the loop cannot be run as fast as possible?)

+		}
  	}
  	adev->shutdown = true;


Kind regards,

Paul



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux