On 2023-11-06 5:40, ZhenGuo Yin wrote:
[Why]
A warning trace is emitted when the gtt drm_mm allocator is cleaned up during driver unload, because gang_ctx_bo and wptr_bo are never freed.
This isn't just a problem with module unloading, but a more general memory leak. pqm_uninit does not run at module unload; it runs every time a ROCm process terminates.
[How]
Free gang_ctx_bo and wptr_bo in pqm_uninit.

Signed-off-by: ZhenGuo Yin <zhenguo.yin@xxxxxxx>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 77649392e233..fdb03b08df72 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -179,6 +179,14 @@ void pqm_uninit(struct process_queue_manager *pqm)
 		    !pqn->q->device->kfd->shared_resources.enable_mes)
 			amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info, pqn->q->gws);
+
+		if (pqn->q->device->kfd->shared_resources.enable_mes) {
+			amdgpu_amdkfd_free_gtt_mem(pqn->q->device->kfd->adev,
+						   pqn->q->gang_ctx_bo);
+			if (pqn->q->wptr_bo)
+				amdgpu_amdkfd_free_gtt_mem(pqn->q->device->kfd->adev, pqn->q->wptr_bo);
+		}
It looks like we're duplicating more and more code from pqm_destroy_queue here. I wonder if we should have a common helper function for freeing a queue's resources that could get used in both places.
Regards, Felix
+
 		kfd_procfs_del_queue(pqn->q);
 		uninit_queue(pqn->q);
 		list_del(&pqn->process_queue_list);
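
For illustration, the common helper suggested in the review could look roughly like the userspace sketch below. Everything in it is an assumption made for the example: struct mock_queue, mock_free_gtt_mem and mock_pqm_free_queue_resources are hypothetical stand-ins, not the driver's types or functions, and the real amdgpu_amdkfd_free_gtt_mem takes different arguments. The sketch only shows the shape of the idea: one function owns the freeing of gang_ctx_bo and wptr_bo, and both the explicit destroy path and the process-teardown path call it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel queue; only the fields used here. */
struct mock_queue {
	bool enable_mes;   /* stands in for ...shared_resources.enable_mes */
	void *gang_ctx_bo; /* allocated per queue when MES is enabled */
	void *wptr_bo;     /* optional, may be NULL */
};

/* Counts how many buffers were released, so the sketch can be checked. */
static int mock_free_count;

/* Hypothetical stand-in for the GTT free call: releases the buffer and
 * clears the caller's pointer so a second call becomes a no-op. */
static void mock_free_gtt_mem(void **bo)
{
	free(*bo);
	*bo = NULL;
	mock_free_count++;
}

/* Hypothetical common helper: the single place that knows which
 * per-queue buffers have to be released. */
static void mock_pqm_free_queue_resources(struct mock_queue *q)
{
	if (!q->enable_mes)
		return;
	if (q->gang_ctx_bo)
		mock_free_gtt_mem(&q->gang_ctx_bo);
	if (q->wptr_bo)
		mock_free_gtt_mem(&q->wptr_bo);
}

/* Mock of the explicit destroy path (the role of pqm_destroy_queue). */
static void mock_destroy_queue(struct mock_queue *q)
{
	mock_pqm_free_queue_resources(q);
}

/* Mock of the process-teardown path (the role of pqm_uninit): walks
 * every queue still owned by the process and releases its buffers. */
static void mock_uninit(struct mock_queue *queues, size_t n)
{
	for (size_t i = 0; i < n; i++)
		mock_pqm_free_queue_resources(&queues[i]);
}
```

Unlike the patch, the sketch also checks gang_ctx_bo for NULL and clears each pointer after freeing it. That is a design choice of the example, made so that calling the helper twice on the same queue is harmless; whether the driver wants that behaviour is a separate question.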