Re: [PATCH] drm/ttm: Drain workqueue before sys manager release

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 13.10.23 um 16:34 schrieb Karolina Stolarek:
In rare cases, a delayed destruction of a BO with a system resource
could stay in the workqueue until drain_workqueue() is called
in ttm_device_fini(). An attempt to free a resource from an already
released manager results in NULL pointer dereference. Move the step
of draining and destroying the workqueue so it happens before
the ttm_sys_manager cleanup.

Good catch! But it's not only the drainage of the workqueue which is at the wrong place here.

Removing the device from the device list should come before destroying the system domain as well.

So I think you should probably rather move the chunk:

        man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
        ttm_resource_manager_set_used(man, false);
        ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);

after the destroy_workqueue() instead of the other way around.

Apart from that looks good to me.

Thanks,
Christian.


Fixes: 9bff18d13473 ("drm/ttm: use per BO cleanup workers")
Signed-off-by: Karolina Stolarek <karolina.stolarek@xxxxxxxxx>
---
Some background: I stumbled upon this issue when testing
ttm_bo_pipeline_gutting() with BO with an active dma_resv fence. In ~2% of
the runs, the delayed destruction of the ghost wouldn't happen until the
drain_queue() step. man->func->free(man, *res) got called via
ttm_bo_cleanup_memtype_use(), the manager and its functions were nowhere to
be seen, resulting in a nulptr deref.

  drivers/gpu/drm/ttm/ttm_device.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 7726a72befc5..753126581620 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -232,6 +232,9 @@ void ttm_device_fini(struct ttm_device *bdev)
  	struct ttm_resource_manager *man;
  	unsigned i;
+ drain_workqueue(bdev->wq);
+	destroy_workqueue(bdev->wq);
+
  	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
  	ttm_resource_manager_set_used(man, false);
  	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
@@ -240,9 +243,6 @@ void ttm_device_fini(struct ttm_device *bdev)
  	list_del(&bdev->device_list);
  	mutex_unlock(&ttm_global_mutex);
- drain_workqueue(bdev->wq);
-	destroy_workqueue(bdev->wq);
-
  	spin_lock(&bdev->lru_lock);
  	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
  		if (list_empty(&man->lru[0]))




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux