Re: [PATCH v11 00/28] AMDGPU usermode queues

On 19/09/2024 18:59, Alex Deucher wrote:
On Mon, Sep 9, 2024 at 4:07 PM Shashank Sharma <shashank.sharma@xxxxxxx> wrote:
This patch series introduces the base code for AMDGPU usermode queues for gfx
workloads. Usermode queues are a method of submitting GPU workloads to the
graphics hardware without any interaction with the kernel/DRM schedulers. In
this method, a userspace graphics application can create its own workqueue
and submit it directly to the GPU HW.

The general idea of how Userqueues are supposed to work:
- The application creates the following GPU objects:
   - A queue object to hold the workload packets.
   - A read pointer object.
   - A write pointer object.
   - A doorbell page.
   - Other supporting buffer objects as required by the target IP engine
     (shadow, GDS, etc.; details are available via AMDGPU_INFO_IOCTL)
The queue, rptr, wptr, and metadata buffers don't have to be separate
buffers.  Userspace could suballocate them out of the same buffer.  We
just need the virtual addresses.  However, we need to keep track of
the GPU virtual addresses used by the user queue for these buffers and
prevent them from being unmapped until the queue is destroyed, similar
to what we do on the KFD side.  Otherwise, the user could unmap one of
the buffers and submit work to the user queue which could cause it to
hang.
Noted, thanks Alex.
Alex

- The application picks a 32-bit offset in the doorbell page for this
   queue.
- The application uses the usermode_queue_create IOCTL introduced in
   this patch, by passing the GPU addresses of these objects (read ptr,
   write ptr, queue base address, shadow, gds) with doorbell object and
   32-bit doorbell offset in the doorbell page.
- The kernel creates the queue and maps it in the HW.
- The application maps the GPU buffers in process address space.
- The application can start submitting data to the queue as soon as
   the kernel IOCTL returns.
- After filling the workload data in the queue, the app must write the
   number of dwords added to the queue into the doorbell offset and the
   WPTR buffer. The GPU will start fetching the data as soon as this is
   done.
- This series adds usermode queue support for all three MES based IPs
   (GFX, SDMA and Compute).
- This series also adds eviction fences to handle migration of the
   userqueue mapped buffers by TTM.
- For synchronization of userqueues, we have added a secure semaphore
   IOCTL, which is being reviewed separately here:
   https://patchwork.freedesktop.org/patch/611971/
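
For readers who want a concrete picture of the submission steps listed
above (fill the queue, update the WPTR buffer, write the doorbell), here is
a minimal sketch that models them in plain process memory. It is not the
amdgpu UAPI and does not touch the hardware: struct fake_userq,
userq_submit(), fake_gpu_fetch() and RING_DWORDS are all hypothetical names
made up for illustration. The sketch also assumes the value written to the
WPTR buffer and the doorbell is a running total of dwords, and that the
queue size is a power of two.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model only: none of these names come from the patch series. */
#define RING_DWORDS 256u /* queue size in dwords, assumed power of two */

struct fake_userq {
	uint32_t ring[RING_DWORDS]; /* queue object holding the packets   */
	volatile uint64_t wptr;     /* write pointer buffer (dword count) */
	volatile uint64_t rptr;     /* read pointer buffer (dword count)  */
	volatile uint64_t doorbell; /* stands in for the doorbell slot    */
};

static void userq_init(struct fake_userq *q)
{
	memset(q, 0, sizeof(*q));
}

/* Free space in dwords: queue size minus what is still unread. */
static uint32_t userq_space(const struct fake_userq *q)
{
	return RING_DWORDS - (uint32_t)(q->wptr - q->rptr);
}

/*
 * Copy a packet into the ring, then publish the new write pointer and
 * write the doorbell. Returns 0 on success, -1 if the packet does not fit.
 */
static int userq_submit(struct fake_userq *q, const uint32_t *pkt,
			uint32_t ndw)
{
	uint64_t w = q->wptr;
	uint32_t i;

	if (ndw > userq_space(q))
		return -1;

	for (i = 0; i < ndw; i++)
		q->ring[(w + i) & (RING_DWORDS - 1)] = pkt[i];
	w += ndw;

	/* Packet data must be visible before the pointer updates. */
	atomic_thread_fence(memory_order_release);
	q->wptr = w;     /* WPTR buffer first ...          */
	q->doorbell = w; /* ... then the doorbell write.   */
	return 0;
}

/* Stand-in for the GPU: consume everything up to the doorbell value. */
static uint32_t fake_gpu_fetch(struct fake_userq *q)
{
	uint32_t n = (uint32_t)(q->doorbell - q->rptr);

	q->rptr = q->doorbell;
	return n;
}
```

A caller would initialise the queue with userq_init(), call userq_submit()
once per packet, and treat a -1 return as "queue full, wait for the read
pointer to advance". In the real flow the doorbell is a slot in the mapped
doorbell page and the fetch is done by the hardware, not by a helper.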

libDRM UAPI changes for this series can be found here:
(This also contains an example test utility which demonstrates
the usage of userqueue UAPI)
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/287

MESA changes consuming this series can be seen in the MR here:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29010

Alex Deucher (1):
   drm/amdgpu: UAPI for user queue management

Arvind Yadav (4):
   drm/amdgpu: enable SDMA usermode queues
   drm/amdgpu: Add input fence to sync bo unmap
   drm/amdgpu: fix MES GFX mask
   Revert "drm/amdgpu: don't allow userspace to create a doorbell BO"

Shashank Sharma (18):
   drm/amdgpu: add usermode queue base code
   drm/amdgpu: add new IOCTL for usermode queue
   drm/amdgpu: add helpers to create userqueue object
   drm/amdgpu: create MES-V11 usermode queue for GFX
   drm/amdgpu: create context space for usermode queue
   drm/amdgpu: map usermode queue into MES
   drm/amdgpu: map wptr BO into GART
   drm/amdgpu: generate doorbell index for userqueue
   drm/amdgpu: cleanup leftover queues
   drm/amdgpu: enable GFX-V11 userqueue support
   drm/amdgpu: enable compute/gfx usermode queue
   drm/amdgpu: update userqueue BOs and PDs
   drm/amdgpu: add kernel config for gfx-userqueue
   drm/amdgpu: add gfx eviction fence helpers
   drm/amdgpu: add userqueue suspend/resume functions
   drm/amdgpu: suspend gfx userqueues
   drm/amdgpu: resume gfx userqueues
   Revert "drm/amdgpu/gfx11: only enable CP GFX shadowing on SR-IOV"

  drivers/gpu/drm/amd/amdgpu/Kconfig            |   8 +
  drivers/gpu/drm/amd/amdgpu/Makefile           |  10 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  11 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |   5 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  10 +
  .../drm/amd/amdgpu/amdgpu_eviction_fence.c    | 297 ++++++++
  .../drm/amd/amdgpu/amdgpu_eviction_fence.h    |  67 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  68 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c       |  11 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c       |   3 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h       |   2 +-
  .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c   | 713 ++++++++++++++++++
  .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.h   |  74 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 644 ++++++++++++++++
  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c        |  42 +-
  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c        |  16 +-
  .../gpu/drm/amd/amdgpu/mes_v11_0_userqueue.c  | 395 ++++++++++
  .../gpu/drm/amd/amdgpu/mes_v11_0_userqueue.h  |  30 +
  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c        |   5 +
  .../gpu/drm/amd/include/amdgpu_userqueue.h    | 100 +++
  drivers/gpu/drm/amd/include/v11_structs.h     |   4 +-
  include/uapi/drm/amdgpu_drm.h                 | 252 +++++++
  22 files changed, 2722 insertions(+), 45 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.h
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
  create mode 100644 drivers/gpu/drm/amd/amdgpu/mes_v11_0_userqueue.c
  create mode 100644 drivers/gpu/drm/amd/amdgpu/mes_v11_0_userqueue.h
  create mode 100644 drivers/gpu/drm/amd/include/amdgpu_userqueue.h

--
2.45.1



