Re: [PATCH v4 07/10] drm/amdgpu: map usermode queue into MES

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 25/04/2023 17:33, Christian König wrote:
Am 25.04.23 um 15:31 schrieb Shashank Sharma:

On 25/04/2023 14:34, Christian König wrote:
Am 24.04.23 um 19:38 schrieb Shashank Sharma:
This patch adds new functions to map/unmap a usermode queue into
the FW, using the MES ring. As soon as this mapping is done, the
queue would  be considered ready to accept the workload.

V1: Addressed review comments from Alex on the RFC patch series
     - Map/Unmap should be IP specific.
V2:
     Addressed review comments from Christian:
     - Fix the wptr_mc_addr calculation (moved into another patch)
     Addressed review comments from Alex:
     - Do not add fptrs for map/unmap

V3: Integration with doorbell manager
V4: Rebase

Cc: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: Christian Koenig <christian.koenig@xxxxxxx>
Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 67 ++++++++++++++++++++++++++
  1 file changed, 67 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 86de35292f69..7a45d68091ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -31,6 +31,7 @@
  #include "amdgpu_smu.h"
  #include "amdgpu_atomfirmware.h"
  #include "amdgpu_userqueue.h"
+#include "amdgpu_mes.h"
  #include "imu_v11_0.h"
  #include "soc21.h"
  #include "nvd.h"
@@ -6411,6 +6412,64 @@ const struct amdgpu_ip_block_version gfx_v11_0_ip_block =
      .funcs = &gfx_v11_0_ip_funcs,
  };
  +static int gfx_v11_userq_map(struct amdgpu_userq_mgr *uq_mgr,
+                 struct amdgpu_usermode_queue *queue)
+{
+    struct amdgpu_device *adev = uq_mgr->adev;
+    struct mes_add_queue_input queue_input;
+    int r;
+
+    memset(&queue_input, 0x0, sizeof(struct mes_add_queue_input));
+
+    queue_input.process_va_start = 0;
+    queue_input.process_va_end = (adev->vm_manager.max_pfn - 1) << AMDGPU_GPU_PAGE_SHIFT;

Could be that this is wrong. What is that value used for?

Both start and end are required by MES FW for mapping the MQD. The value here is generated same as in KFD userqueue.

Ok in this case then that's probably the values which will be written into the registers. So that's probably ok.

+    queue_input.process_quantum = 100000; /* 10ms */
+    queue_input.gang_quantum = 10000; /* 1ms */
+    queue_input.paging = false;
+
+    queue_input.gang_context_addr = queue->gang_ctx_gpu_addr;
+    queue_input.process_context_addr = queue->proc_ctx_gpu_addr;
+    queue_input.inprocess_gang_priority = AMDGPU_MES_PRIORITY_LEVEL_NORMAL; +    queue_input.gang_global_priority_level = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+
+    queue_input.process_id = queue->vm->pasid;
+    queue_input.queue_type = queue->queue_type;
+    queue_input.mqd_addr = queue->mqd.gpu_addr;
+    queue_input.wptr_addr = queue->userq_prop.wptr_gpu_addr;
+    queue_input.queue_size = queue->userq_prop.queue_size >> 2;
+    queue_input.doorbell_offset = queue->userq_prop.doorbell_index;
+    queue_input.page_table_base_addr = amdgpu_gmc_pd_addr(queue->vm->root.bo);

You need to implement some mechanism to keep those page tables and all resources in place.

I don't get it, would you mind elaborating this please ?

We need something like the KFD eviction fence. Probably best if we have a call on that.

Noted,

- Shashank


Christian.


- Shashank


Christian.

+
+    amdgpu_mes_lock(&adev->mes);
+    r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input);
+    amdgpu_mes_unlock(&adev->mes);
+    if (r) {
+        DRM_ERROR("Failed to map queue in HW, err (%d)\n", r);
+        return r;
+    }
+
+    DRM_DEBUG_DRIVER("Queue %d mapped successfully\n", queue->queue_id);
+    return 0;
+}
+
+static void gfx_v11_userq_unmap(struct amdgpu_userq_mgr *uq_mgr,
+                struct amdgpu_usermode_queue *queue)
+{
+    struct amdgpu_device *adev = uq_mgr->adev;
+    struct mes_remove_queue_input queue_input;
+    int r;
+
+    memset(&queue_input, 0x0, sizeof(struct mes_remove_queue_input));
+    queue_input.doorbell_offset = queue->userq_prop.doorbell_index;
+    queue_input.gang_context_addr = queue->gang_ctx_gpu_addr;
+
+    amdgpu_mes_lock(&adev->mes);
+    r = adev->mes.funcs->remove_hw_queue(&adev->mes, &queue_input);
+    amdgpu_mes_unlock(&adev->mes);
+    if (r)
+        DRM_ERROR("Failed to unmap queue in HW, err (%d)\n", r);
+}
+
  static void gfx_v11_userq_set_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
                      struct amdgpu_usermode_queue *queue)
  {
@@ -6507,6 +6566,13 @@ gfx_v11_userq_mqd_create(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_usermode
        gfx_v11_userq_set_ctx_space(uq_mgr, queue);
      amdgpu_bo_unreserve(mqd->obj);
+
+    /* Map the queue in HW using MES ring */
+    r = gfx_v11_userq_map(uq_mgr, queue);
+    if (r) {
+        DRM_ERROR("Failed to map userqueue (%d)\n", r);
+        goto free_ctx;
+    }
      DRM_DEBUG_DRIVER("MQD for queue %d created\n", queue->queue_id);
      return 0;
  @@ -6523,6 +6589,7 @@ gfx_v11_userq_mqd_destroy(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_usermod
  {
      struct amdgpu_userq_ctx_space *mqd = &queue->mqd;
  +    gfx_v11_userq_unmap(uq_mgr, queue);
      gfx_v11_userq_destroy_ctx_space(uq_mgr, queue);
      amdgpu_bo_free_kernel(&mqd->obj, &mqd->gpu_addr, &mqd->cpu_ptr);
  }





[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux