Re: Re: [PATCH] drm/amdgpu: Make sure ttm delayed work finished

That warning is a bit more than a little annoying.

Before we stop the delayed delete worker we *must* make absolutely sure that nothing is still running on the hardware. Otherwise we could easily run into use-after-free issues.

There should be an amdgpu_fence_wait_empty() somewhere before the flush_delayed_work() call. If there isn't, we have a problem elsewhere.
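
To illustrate the ordering described above, here is a minimal user-space sketch in C. It is not driver code: struct model_dev and all model_* functions are hypothetical stand-ins, with model_fence_wait_empty() playing the role of amdgpu_fence_wait_empty() and model_flush_delayed_delete() playing the role of the flush_delayed_work() call. The only point it tries to make is that the flush can empty the delayed delete list only once the hardware is idle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the device state; not amdgpu code. */
struct model_dev {
	int fences_pending;       /* jobs the "hardware" has not finished yet */
	int bos_on_delayed_list;  /* BOs waiting for delayed destruction */
};

/* Stand-in for amdgpu_fence_wait_empty(): returns once the hardware is idle. */
static void model_fence_wait_empty(struct model_dev *d)
{
	d->fences_pending = 0;
}

/*
 * Stand-in for flushing the delayed delete worker: BOs can only be
 * destroyed when no fence is pending. Returns the number of BOs left.
 */
static int model_flush_delayed_delete(struct model_dev *d)
{
	if (d->fences_pending == 0)
		d->bos_on_delayed_list = 0;
	return d->bos_on_delayed_list;
}

/* Teardown with or without the fence wait in front of the flush. */
static int model_teardown(struct model_dev *d, bool wait_for_fences)
{
	if (wait_for_fences)
		model_fence_wait_empty(d);
	return model_flush_delayed_delete(d);
}
```

In this model, tearing down with the fence wait leaves no BO behind, while skipping it leaves every BO on the list, which is the situation that would produce the warning.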

Thanks for investigating this,
Christian.

On 13.04.22 at 09:47, Pan, Xinhui wrote:
[AMD Official Use Only]

The tester's log says it is the drm framebuffer BO that is busy.

I suspect there simply isn't enough time for its fence to be signaled,
since adding a delay also works in my test.
But the warning is a little annoying.

________________________________________
From: Koenig, Christian <Christian.Koenig@xxxxxxx>
Sent: April 13, 2022 15:30
To: Pan, Xinhui; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu: Make sure ttm delayed work finished

We don't need that.

TTM only reschedules when the BOs are still busy.

And if the BOs are still busy when you unload the driver, we have much bigger problems than this TTM worker :)
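
As a rough illustration of the rescheduling rule above, here is a small user-space model in C. It is only a sketch under assumptions: struct model_bo and model_delayed_delete_run() are hypothetical names, not TTM code, and a "busy" BO simply means its fence has not signaled yet. One run of the modeled worker destroys every idle BO and asks to be rescheduled only if a busy BO remains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a BO on the delayed delete list; not TTM code. */
struct model_bo {
	bool busy;       /* fence not yet signaled */
	bool destroyed;  /* already freed by the worker */
};

/*
 * One run of the modeled worker: destroy every idle BO and report
 * whether the worker would reschedule itself. It does so only when
 * at least one BO is still busy.
 */
static bool model_delayed_delete_run(struct model_bo *bos, size_t n)
{
	bool reschedule = false;

	for (size_t i = 0; i < n; i++) {
		if (bos[i].destroyed)
			continue;
		if (bos[i].busy)
			reschedule = true;        /* cannot free yet, retry later */
		else
			bos[i].destroyed = true;  /* idle, safe to free now */
	}
	return reschedule;
}
```

With all BOs idle, a single run finishes the job and nothing is rescheduled. A reschedule at driver unload therefore points at a BO that is still in use by the hardware.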

Regards,
Christian

________________________________
From: Pan, Xinhui <Xinhui.Pan@xxxxxxx>
Sent: Wednesday, April 13, 2022 05:08
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Pan, Xinhui <Xinhui.Pan@xxxxxxx>
Subject: [PATCH] drm/amdgpu: Make sure ttm delayed work finished

ttm_device_delayed_workqueue reschedules itself if there are pending
BOs to be destroyed, so a single flush + cancel_sync is not enough. We
still see the lru_list not empty warning.

Fix it by waiting for all BOs to be destroyed.

Signed-off-by: xinhui pan <xinhui.pan@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++--
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 6f47726f1765..e249923eb9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3957,11 +3957,17 @@ static void amdgpu_device_unmap_mmio(struct amdgpu_device *adev)
   */
  void amdgpu_device_fini_hw(struct amdgpu_device *adev)
  {
+       int pending = 1;
+
          dev_info(adev->dev, "amdgpu: finishing device.\n");
          flush_delayed_work(&adev->delayed_init_work);
-       if (adev->mman.initialized) {
+       while (adev->mman.initialized && pending) {
                  flush_delayed_work(&adev->mman.bdev.wq);
-               ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
+               pending = ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
+               if (pending) {
+                       ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, true);
+                       msleep(((HZ / 100) < 1) ? 1 : HZ / 100);
+               }
          }
          adev->shutdown = true;

--
2.25.1




