On 03/01/2025 07:34, Saleemkhan Jamadar wrote:
I see that your patch is removing the mapped doorbell index from the queue as well, whereas the mapped index of the doorbell is required to map a queue in the MES (see amdgpu_userqueue_map() function). So that's wrong interpretation of the code that its not being used anywhere.Hi Shashank, Replied inline [Saleem] Regards, Salem On 02/01/25 18:58, Sharma, Shashank wrote:[Saleem]:This assignment is not used anywhere, moreover it gets overwritten with mapped doorbell index below.+ (amd-gfx) On 01/01/2025 07:03, Saleemkhan Jamadar wrote:Nack, this is going to break mapping/unmapping for GFX UQ. Nothing should be removed from the queue structure, as all of it is accounted for.#resending patch From 79cd62f882197505dbf9c489ead2b0bcab98209f Mon Sep 17 00:00:00 2001 From: Saleemkhan Jamadar <saleemkhan.jamadar@xxxxxxx> Date: Wed, 18 Dec 2024 19:30:22 +0530 Subject: [PATCH] drm/amdgpu: user queue doorbell allocation for IP reqs Currenlty doorbell allocation handles only 64 bit db size. In case of VCN we need to allocated AGDB and per instance doorbell.VCN/UMSCH doorbell size is 32 bit and offset calculated is from specific range from the allocated page. With these changes individual IP can provide specific reqs for db allocation. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@xxxxxxx> ---drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 41 ++++++++++++++-----.../gpu/drm/amd/include/amdgpu_userqueue.h | 12 ++++++ 2 files changed, 42 insertions(+), 11 deletions(-)diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.cindex cba51bdf2e2c..4fff15e0d838 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c@@ -163,18 +163,17 @@ void amdgpu_userqueue_destroy_object(struct amdgpu_userq_mgr *uq_mgr,amdgpu_bo_unref(&userq_obj->obj); } -static uint64_t +uint64_t amdgpu_userqueue_get_doorbell_index(struct amdgpu_userq_mgr *uq_mgr, - struct amdgpu_usermode_queue *queue, - struct drm_file *filp, - uint32_t doorbell_offset) + struct amdgpu_db_info *db_info, + struct drm_file *filp) { uint64_t index; struct drm_gem_object *gobj; - struct amdgpu_userq_obj *db_obj = &queue->db_obj; - int r; + struct amdgpu_userq_obj *db_obj = db_info->db_obj; + int r, db_size; - gobj = drm_gem_object_lookup(filp, queue->doorbell_handle); + gobj = drm_gem_object_lookup(filp, db_info->doorbell_handle); if (gobj == NULL) { DRM_ERROR("Can't find GEM object for doorbell\n"); return -EINVAL;@@ -196,8 +195,23 @@ amdgpu_userqueue_get_doorbell_index(struct amdgpu_userq_mgr *uq_mgr,goto unpin_bo; } + switch (db_info->queue_type) { + case AMDGPU_HW_IP_GFX: + case AMDGPU_HW_IP_COMPUTE: + case AMDGPU_HW_IP_DMA: + db_size = sizeof(u64); + break; + case AMDGPU_HW_IP_VCN_ENC: + db_size = sizeof(u32);+ db_info->doorbell_offset += AMDGPU_NAVI10_DOORBELL64_VCN0_1 << 1;+ break; + default:+ DRM_WARN("User queues not supported for IP (%d )\n", db_info->queue_type);+ return -EINVAL; + } + index = amdgpu_doorbell_index_on_bar(uq_mgr->adev, db_obj->obj, - doorbell_offset, sizeof(u64)); + db_info->doorbell_offset, db_size);DRM_DEBUG_DRIVER("[Usermode queues] doorbell index=%lld\n", index);amdgpu_bo_unreserve(db_obj->obj); return index;@@ -242,6 +256,7 @@ amdgpu_userqueue_create(struct drm_file *filp, union drm_amdgpu_userq *args)struct amdgpu_device *adev = uq_mgr->adev; const struct amdgpu_userq_funcs *uq_funcs; struct amdgpu_usermode_queue *queue; + struct amdgpu_db_info db_info; uint64_t index; int qid, r = 0;@@ -269,19 +284,23 @@ amdgpu_userqueue_create(struct drm_file *filp, union drm_amdgpu_userq *args)goto unlock; } queue->doorbell_handle = args->in.doorbell_handle; - queue->doorbell_index = args->in.doorbell_offset;
I can see that all the information you are adding in struct db_info is coming from queue only, then why to add a new structure here at all ? Instead, you can just do this in function [Saleem]: In previous impl you can see that db_obj is fetched form queue structure. In case of vcn we have mulpitle doorbell to be allocated, to accomodate this and to avoid the another level of check to identify which doorbell is being allocated this db_info (structure used locally) helps. db_info shares data for which allocation needs to done.queue->queue_type = args->in.ip_type; queue->vm = &fpriv->vm; + db_info.queue_type = queue->queue_type; + db_info.doorbell_handle = queue->doorbell_handle; + db_info.db_obj = &queue->db_obj; + db_info.doorbell_offset = args->in.doorbell_offset; +
Nope, there is no difference in information available in the queue->* vs db_info->*, so its duplication of code for no apparent reason, so it doesn't make sense to use this here in the base/IP independent UQ file. Just update the doorbell offset as suggested before in the queue->* itself, and then if you want to have this structure, you can introduce it in the vcn specific implementation (vcn_db_info*) and move this code which you added here into the vcn_userqueueu* file.
- Shashank
amdgpu_userqueue_get_doorbell_index(): db_size = sizeof(u62); switch(queue->queue_type) { case VCN: queue->doorbell_offset += AMDGPU_NAVI10_DOORBELL64_VCN0_1 << 1; db_size = sizeof(u32); break; } [Saleem]: This change I can make. <same doorbell offset calculation as usual> - Shashank/* Convert relative doorbell offset into absolute doorbell index */ - index = amdgpu_userqueue_get_doorbell_index(uq_mgr, queue, filp, args->in.doorbell_offset); + index = amdgpu_userqueue_get_doorbell_index(uq_mgr, &db_info, filp);if (index == (uint64_t)-EINVAL) { DRM_ERROR("Failed to get doorbell for queue\n"); kfree(queue); goto unlock; } - queue->doorbell_index = index; + queue->doorbell_index = index; xa_init_flags(&queue->fence_drv_xa, XA_FLAGS_ALLOC); r = amdgpu_userq_fence_driver_alloc(adev, queue); if (r) {diff --git a/drivers/gpu/drm/amd/include/amdgpu_userqueue.h b/drivers/gpu/drm/amd/include/amdgpu_userqueue.hindex 2bf28f3454cb..3d54601c6a24 100644 --- a/drivers/gpu/drm/amd/include/amdgpu_userqueue.h +++ b/drivers/gpu/drm/amd/include/amdgpu_userqueue.h @@ -39,6 +39,13 @@ struct amdgpu_userq_obj { struct amdgpu_bo *obj; }; +struct amdgpu_db_info { + uint64_t doorbell_handle; + uint32_t queue_type; + uint32_t doorbell_offset; + struct amdgpu_userq_obj *db_obj; +}; + struct amdgpu_usermode_queue { int queue_type; uint8_t queue_active;@@ -93,4 +100,9 @@ void amdgpu_userqueue_destroy_object(struct amdgpu_userq_mgr *uq_mgr,void amdgpu_userqueue_suspend(struct amdgpu_userq_mgr *uq_mgr); int amdgpu_userqueue_active(struct amdgpu_userq_mgr *uq_mgr); ++uint64_t amdgpu_userqueue_get_doorbell_index(struct amdgpu_userq_mgr *uq_mgr,+ struct amdgpu_db_info *db_info, + struct drm_file *filp); + #endif