On 25/04/2023 17:33, Christian König wrote:
> On 25.04.23 15:31, Shashank Sharma wrote:
>> On 25/04/2023 14:34, Christian König wrote:
>>> On 24.04.23 19:38, Shashank Sharma wrote:
>>>> This patch adds new functions to map/unmap a usermode queue into
>>>> the FW, using the MES ring. As soon as this mapping is done, the
>>>> queue is considered ready to accept workloads.
>>>>
>>>> V1: Addressed review comments from Alex on the RFC patch series
>>>>     - Map/Unmap should be IP specific.
>>>> V2: Addressed review comments from Christian:
>>>>     - Fix the wptr_mc_addr calculation (moved into another patch)
>>>>     Addressed review comments from Alex:
>>>>     - Do not add fptrs for map/unmap
>>>> V3: Integration with doorbell manager
>>>> V4: Rebase
>>>>
>>>> Cc: Alex Deucher <alexander.deucher@xxxxxxx>
>>>> Cc: Christian Koenig <christian.koenig@xxxxxxx>
>>>> Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 67 ++++++++++++++++++++++++++
>>>>  1 file changed, 67 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>>>> index 86de35292f69..7a45d68091ec 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>>>> @@ -31,6 +31,7 @@
>>>>  #include "amdgpu_smu.h"
>>>>  #include "amdgpu_atomfirmware.h"
>>>>  #include "amdgpu_userqueue.h"
>>>> +#include "amdgpu_mes.h"
>>>>  #include "imu_v11_0.h"
>>>>  #include "soc21.h"
>>>>  #include "nvd.h"
>>>> @@ -6411,6 +6412,64 @@ const struct amdgpu_ip_block_version gfx_v11_0_ip_block =
>>>>          .funcs = &gfx_v11_0_ip_funcs,
>>>>  };
>>>>
>>>> +static int gfx_v11_userq_map(struct amdgpu_userq_mgr *uq_mgr,
>>>> +                             struct amdgpu_usermode_queue *queue)
>>>> +{
>>>> +        struct amdgpu_device *adev = uq_mgr->adev;
>>>> +        struct mes_add_queue_input queue_input;
>>>> +        int r;
>>>> +
>>>> +        memset(&queue_input, 0x0, sizeof(struct mes_add_queue_input));
>>>> +
>>>> +        queue_input.process_va_start = 0;
>>>> +        queue_input.process_va_end = (adev->vm_manager.max_pfn - 1) << AMDGPU_GPU_PAGE_SHIFT;
>>>
>>> Could be that this is wrong. What is that value used for?
>>
>> Both start and end are required by the MES FW for mapping the MQD. The
>> value here is generated the same way as in the KFD userqueue.
>
> Ok, in this case these are probably the values which will be written
> into the registers. So that's probably ok.
>
>>>> +        queue_input.process_quantum = 100000; /* 10ms */
>>>> +        queue_input.gang_quantum = 10000; /* 1ms */
>>>> +        queue_input.paging = false;
>>>> +
>>>> +        queue_input.gang_context_addr = queue->gang_ctx_gpu_addr;
>>>> +        queue_input.process_context_addr = queue->proc_ctx_gpu_addr;
>>>> +        queue_input.inprocess_gang_priority = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
>>>> +        queue_input.gang_global_priority_level = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
>>>> +
>>>> +        queue_input.process_id = queue->vm->pasid;
>>>> +        queue_input.queue_type = queue->queue_type;
>>>> +        queue_input.mqd_addr = queue->mqd.gpu_addr;
>>>> +        queue_input.wptr_addr = queue->userq_prop.wptr_gpu_addr;
>>>> +        queue_input.queue_size = queue->userq_prop.queue_size >> 2;
>>>> +        queue_input.doorbell_offset = queue->userq_prop.doorbell_index;
>>>> +        queue_input.page_table_base_addr = amdgpu_gmc_pd_addr(queue->vm->root.bo);
>>>
>>> You need to implement some mechanism to keep those page tables and all
>>> resources in place.
>>
>> I don't get it, would you mind elaborating on this, please?
>
> We need something like the KFD eviction fence. Probably best if we have
> a call on that.
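Just to make sure I understand the direction before we talk: something
roughly along these lines? This is only a rough sketch of the idea
(loosely based on the KFD eviction-fence concept); none of the names
below exist in the driver yet, they are all placeholders.

#include <linux/dma-fence.h>
#include <linux/spinlock.h>

/*
 * Sketch only, placeholder names: a driver-private dma_fence attached to
 * the VM/page-table BOs while the user queue is mapped. The idea is that
 * TTM has to wait for this fence before it can move those BOs, and the
 * enable_signaling() callback gives us a hook to unmap the queue from MES
 * first, so the page tables stay in place as long as the queue is mapped.
 */
struct userq_eviction_fence {
        struct dma_fence base;
        spinlock_t lock;
        struct amdgpu_userq_mgr *uq_mgr;        /* placeholder back-pointer */
};

static const char *userq_ev_fence_get_driver_name(struct dma_fence *f)
{
        return "amdgpu_userq";
}

static const char *userq_ev_fence_get_timeline_name(struct dma_fence *f)
{
        return "userq_eviction";
}

static bool userq_ev_fence_enable_signaling(struct dma_fence *f)
{
        /*
         * Eviction was requested: kick a worker that unmaps the user
         * queues via MES and signals @f once they are off the hardware.
         */
        return true;
}

static const struct dma_fence_ops userq_ev_fence_ops = {
        .get_driver_name = userq_ev_fence_get_driver_name,
        .get_timeline_name = userq_ev_fence_get_timeline_name,
        .enable_signaling = userq_ev_fence_enable_signaling,
};

static void userq_ev_fence_init(struct userq_eviction_fence *ef,
                                struct amdgpu_userq_mgr *uq_mgr)
{
        ef->uq_mgr = uq_mgr;
        spin_lock_init(&ef->lock);
        dma_fence_init(&ef->base, &userq_ev_fence_ops, &ef->lock,
                       dma_fence_context_alloc(1), 0);
}

The actual plumbing (attaching the fence to the page-table BOs, the unmap
worker, re-mapping the queue on restore) is what I would like to go over
on the call.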
Noted,

- Shashank
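P.S.: A small aside on the quantum values in the patch, in case it helps
other readers: the inline comments imply a 100 ns time base for the MES
quantum fields (that unit is my inference from the comments themselves,
not from firmware documentation):

    process_quantum = 100000  ->  100000 * 100 ns = 10 ms
    gang_quantum    =  10000  ->   10000 * 100 ns =  1 ms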
> Christian.
>
>> - Shashank
>>
>>> Christian.
>>>
>>>> +
>>>> +        amdgpu_mes_lock(&adev->mes);
>>>> +        r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input);
>>>> +        amdgpu_mes_unlock(&adev->mes);
>>>> +        if (r) {
>>>> +                DRM_ERROR("Failed to map queue in HW, err (%d)\n", r);
>>>> +                return r;
>>>> +        }
>>>> +
>>>> +        DRM_DEBUG_DRIVER("Queue %d mapped successfully\n", queue->queue_id);
>>>> +        return 0;
>>>> +}
>>>> +
>>>> +static void gfx_v11_userq_unmap(struct amdgpu_userq_mgr *uq_mgr,
>>>> +                                struct amdgpu_usermode_queue *queue)
>>>> +{
>>>> +        struct amdgpu_device *adev = uq_mgr->adev;
>>>> +        struct mes_remove_queue_input queue_input;
>>>> +        int r;
>>>> +
>>>> +        memset(&queue_input, 0x0, sizeof(struct mes_remove_queue_input));
>>>> +        queue_input.doorbell_offset = queue->userq_prop.doorbell_index;
>>>> +        queue_input.gang_context_addr = queue->gang_ctx_gpu_addr;
>>>> +
>>>> +        amdgpu_mes_lock(&adev->mes);
>>>> +        r = adev->mes.funcs->remove_hw_queue(&adev->mes, &queue_input);
>>>> +        amdgpu_mes_unlock(&adev->mes);
>>>> +        if (r)
>>>> +                DRM_ERROR("Failed to unmap queue in HW, err (%d)\n", r);
>>>> +}
>>>> +
>>>>  static void gfx_v11_userq_set_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
>>>>                                          struct amdgpu_usermode_queue *queue)
>>>>  {
>>>> @@ -6507,6 +6566,13 @@ gfx_v11_userq_mqd_create(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_usermode
>>>>          gfx_v11_userq_set_ctx_space(uq_mgr, queue);
>>>>          amdgpu_bo_unreserve(mqd->obj);
>>>> +
>>>> +        /* Map the queue in HW using MES ring */
>>>> +        r = gfx_v11_userq_map(uq_mgr, queue);
>>>> +        if (r) {
>>>> +                DRM_ERROR("Failed to map userqueue (%d)\n", r);
>>>> +                goto free_ctx;
>>>> +        }
>>>>
>>>>          DRM_DEBUG_DRIVER("MQD for queue %d created\n", queue->queue_id);
>>>>          return 0;
>>>>
>>>> @@ -6523,6 +6589,7 @@ gfx_v11_userq_mqd_destroy(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_usermod
>>>>  {
>>>>          struct amdgpu_userq_ctx_space *mqd = &queue->mqd;
>>>>
>>>> +        gfx_v11_userq_unmap(uq_mgr, queue);
>>>>          gfx_v11_userq_destroy_ctx_space(uq_mgr, queue);
>>>>          amdgpu_bo_free_kernel(&mqd->obj, &mqd->gpu_addr, &mqd->cpu_ptr);
>>>>  }