Re: [PATCH v3 1/9] drm/amdgpu: UAPI for user queue management

Shashank Sharma <shashank.sharma@xxxxxxx> · Thu, 30 Mar 2023 10:15:17 +0200

Hello Pierre-Eric,

Thanks for your review, my comments inline.


On 30/03/2023 10:02, Pierre-Eric Pelloux-Prayer wrote:
Hi Shashank,

On 29/03/2023 18:04, Shashank Sharma wrote:
From: Alex Deucher <alexander.deucher@xxxxxxx>

This patch intorduces new UAPI/IOCTL for usermode graphics
queue. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.

This UAPI maps the queue into GPU, so the graphics app can start
submitting work to the queue as soon as the call returns.

V2: Addressed review comments from Alex and Christian
     - Make the doorbell offset's comment clearer
     - Change the output parameter name to queue_id
V3: Integration with doorbell manager

Cc: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: Christian Koenig <christian.koenig@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx>
---
  include/uapi/drm/amdgpu_drm.h | 55 +++++++++++++++++++++++++++++++++++
  1 file changed, 55 insertions(+)

diff --git a/include/uapi/drm/amdgpu_drm.h 
b/include/uapi/drm/amdgpu_drm.h
index cc5d551abda5..e4943099b9d2 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -54,6 +54,7 @@ extern "C" {
  #define DRM_AMDGPU_VM            0x13
  #define DRM_AMDGPU_FENCE_TO_HANDLE    0x14
  #define DRM_AMDGPU_SCHED        0x15
+#define DRM_AMDGPU_USERQ        0x16
    #define DRM_IOCTL_AMDGPU_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
  #define DRM_IOCTL_AMDGPU_GEM_MMAP    DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -71,6 +72,7 @@ extern "C" {
  #define DRM_IOCTL_AMDGPU_VM        DRM_IOWR(DRM_COMMAND_BASE + 
DRM_AMDGPU_VM, union drm_amdgpu_vm)
  #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE 
+ DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
  #define DRM_IOCTL_AMDGPU_SCHED        DRM_IOW(DRM_COMMAND_BASE + 
DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
+#define DRM_IOCTL_AMDGPU_USERQ        DRM_IOW(DRM_COMMAND_BASE + 
DRM_AMDGPU_USERQ, union drm_amdgpu_userq)
    /**
   * DOC: memory domains
@@ -307,6 +309,59 @@ union drm_amdgpu_ctx {
      union drm_amdgpu_ctx_out out;
  };
  +/* user queue IOCTL */
+#define AMDGPU_USERQ_OP_CREATE    1
+#define AMDGPU_USERQ_OP_FREE    2
+
+#define AMDGPU_USERQ_MQD_FLAGS_SECURE    (1 << 0)
+#define AMDGPU_USERQ_MQD_FLAGS_AQL    (1 << 1)

What is the purpose of these flags?
Could you add some documentation?
Noted,

+
+struct drm_amdgpu_userq_mqd {
+    /** Flags: AMDGPU_USERQ_MQD_FLAGS_* */
+    __u32    flags;
+    /** IP type: AMDGPU_HW_IP_* */
+    __u32    ip_type;
+    /** GEM object handle */
+    __u32   doorbell_handle;
+    /** Doorbell's offset in the doorbell bo */
+    __u32   doorbell_offset;
+    /** GPU virtual address of the queue */
+    __u64   queue_va;
+    /** Size of the queue in bytes */
+    __u64   queue_size;
+    /** GPU virtual address of the rptr */
+    __u64   rptr_va;
+    /** GPU virtual address of the wptr */
+    __u64   wptr_va;
+    /** GPU virtual address of the shadow context space */
+    __u64    shadow_va;
+};

The comments inside drm_amdgpu_userq_mqd could be improved.
Here are some questions I have looking at the API:
* what is a doorbell in this context?
* what's the purpose of the doorbell offset?
* how UMD should size each buffer? I assume doorbell, rptr, wptr
  have min size requirements?
We are planning to add all these in a detailed doc in form of 
cover-letter/text comment in the follow-up libDRM UAPI version, as 
discussed internally.

I'm also wondering why the doorbell needs a handle+offset but
other buffers are passed in as virtual addresses?

As you know, doorbell offset here will be an relative offset in this 
doorbell page, but the MQD needs the absolute offset on the doorbell PCI 
BAR.

So kernel needs both the object as well as relative offset to calculate 
absolute offset.

something like: absolute offset = base offset of this doorbell page + 
relative offset of this doorbell.

+
+struct drm_amdgpu_userq_in {
+    /** AMDGPU_USERQ_OP_* */
+    __u32    op;
+    /** Flags */
+    __u32    flags;

What are these flags?

We have kept these flags to indicate special conditions like secure 
display, encryption etc. We will start utilizing these once we have the

base code (this series) merged up.


+    /** Queue handle to associate the queue free call with,
+     * unused for queue create calls */
+    __u32    queue_id;
+    __u32    pad;
+    /** Queue descriptor */
+    struct drm_amdgpu_userq_mqd mqd;
+};

I'm not familiar with ioctl design but would a union work to
identify the parameters of each operation?

union {
   struct {
      struct drm_amdgpu_userq_mqd mqd;
   } create;
   struct {
      __u32 queue_id;
      __u32 pad;
   } free;
};


I think it might work. I can check this out.


+
+struct drm_amdgpu_userq_out {
+    /** Queue handle */
+    __u32    queue_id;
+    /** Flags */
+    __u32    flags;

What are these flags?

These are not utilized yet. We have kept these flags to indicate special 
out conditions like over subscription/failure due to no more queue slot 
etc.

Noted to be covered in the doc as well.

- Shashank


Thanks,
Pierre-Eric

+};
+
+union drm_amdgpu_userq {
+    struct drm_amdgpu_userq_in in;
+    struct drm_amdgpu_userq_out out;
+};
+
  /* vm ioctl */
  #define AMDGPU_VM_OP_RESERVE_VMID    1
  #define AMDGPU_VM_OP_UNRESERVE_VMID    2