On 2021-09-17 1:53 p.m., Zack Rusin wrote:
On some hardware, in particular in virtualized environments, the
system memory can be shared with the "hardware". In those cases
the BO's allocated through the ttm system manager might be
busy during ttm_bo_put which results in them being scheduled
for a delayed deletion.
The problem is that that the ttm system manager is disabled
before the final delayed deletion is ran in ttm_device_fini.
This results in crashes during freeing of the BO resources
because they're trying to remove themselves from a no longer
existent ttm_resource_manager (e.g. in IGT's core_hotunplug
on vmwgfx)
In general reloading any driver that could share system mem
resources with "hardware" could hit it because nothing
prevents the system mem resources from being scheduled
for delayed deletion (apart from them not being busy probably
anywhere apart from virtualized environments).
Signed-off-by: Zack Rusin <zackr@xxxxxxxxxx>
Cc: Christian Koenig <christian.koenig@xxxxxxx>
Cc: Huang Rui <ray.huang@xxxxxxx>
Cc: David Airlie <airlied@xxxxxxxx>
Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
---
drivers/gpu/drm/ttm/ttm_device.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 9eb8f54b66fc..4ef19cafc755 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -225,10 +225,6 @@ void ttm_device_fini(struct ttm_device *bdev)
struct ttm_resource_manager *man;
unsigned i;
- man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
- ttm_resource_manager_set_used(man, false);
- ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
-
mutex_lock(&ttm_global_mutex);
list_del(&bdev->device_list);
mutex_unlock(&ttm_global_mutex);
@@ -238,6 +234,10 @@ void ttm_device_fini(struct ttm_device *bdev)
if (ttm_bo_delayed_delete(bdev, true))
pr_debug("Delayed destroy list was clean\n");
+ man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
+ ttm_resource_manager_set_used(man, false);
+ ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
+
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
Andrey
spin_lock(&bdev->lru_lock);
for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
if (list_empty(&man->lru[0]))