In rare cases, a delayed destruction of a BO with a system resource
could stay in the workqueue until drain_workqueue() is called in
ttm_device_fini(). An attempt to free a resource from an already
released manager results in a NULL pointer dereference. Move the step
of draining and destroying the workqueue so it happens before the
ttm_sys_manager cleanup.

Fixes: 9bff18d13473 ("drm/ttm: use per BO cleanup workers")
Signed-off-by: Karolina Stolarek <karolina.stolarek@xxxxxxxxx>
---
Some background: I stumbled upon this issue when testing
ttm_bo_pipeline_gutting() with a BO with an active dma_resv fence. In
~2% of the runs, the delayed destruction of the ghost wouldn't happen
until the drain_workqueue() step. By the time man->func->free(man, *res)
got called via ttm_bo_cleanup_memtype_use(), the manager and its
functions were nowhere to be seen, resulting in a NULL pointer deref.

 drivers/gpu/drm/ttm/ttm_device.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 7726a72befc5..753126581620 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -232,6 +232,9 @@ void ttm_device_fini(struct ttm_device *bdev)
 	struct ttm_resource_manager *man;
 	unsigned i;
 
+	drain_workqueue(bdev->wq);
+	destroy_workqueue(bdev->wq);
+
 	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
 	ttm_resource_manager_set_used(man, false);
 	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
@@ -240,9 +243,6 @@ void ttm_device_fini(struct ttm_device *bdev)
 	list_del(&bdev->device_list);
 	mutex_unlock(&ttm_global_mutex);
 
-	drain_workqueue(bdev->wq);
-	destroy_workqueue(bdev->wq);
-
 	spin_lock(&bdev->lru_lock);
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
 		if (list_empty(&man->lru[0]))
-- 
2.25.1