[AMD Official Use Only - General]
Hello Bas,
Thanks for trying this out.
This could be due to the doorbell as you mentioned, Usermode
queue uses doorbell manager internally.
This week, we are planning to publis the latest libDRM sample
code which uses a doorbell object (instead of the doorbell hack
IOCTL), adapting to that should fix your problem in my opinion.
We have tested this full stack (libDRM test + Usermode queue +
doorbell manager) for 500+ consecutive runs, and it worked well
for us.
You can use this integrated kernel stack (1+2) from my gitlab to
build
your kernel:
https://gitlab.freedesktop.org/contactshashanksharma/userq-amdgpu/-/tr
ee/integrated-db-and-uq-v3 Please stay tuned for updated libDRM
changes with doorbell objects.
Regards
Shashank
-----Original Message-----
From: Bas Nieuwenhuizen <bas@xxxxxxxxxxxxxxxxxxx>
Sent: 10 April 2023 02:37
To: Sharma, Shashank <Shashank.Sharma@xxxxxxx>
Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Deucher, Alexander
<Alexander.Deucher@xxxxxxx>; Kuehling, Felix
<Felix.Kuehling@xxxxxxx>;
Koenig, Christian <Christian.Koenig@xxxxxxx>
Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues
Hi Shashank,
I tried writing a program to experiment with usermode queues and
I found some weird behavior: The first run of the program works
as expected, while subsequent runs don't seem to do anything (and
I allocate everything in GTT, so it should be zero initialized
consistently). Is this a known issue?
The linked libdrm code for the uapi still does a doorbell ioctl
so it could very well be that I do the doorbell wrong (especially
since the ioctl implementation was never shared AFAICT?), but it
seems like the kernel submissions (i.e. write wptr in dwords to
the wptr va and to the doorbell). Is it possible to update the
test in libdrm?
Code: https://gitlab.freedesktop.org/bnieuwenhuizen/usermode-queue
Thanks,
Bas
On Wed, Mar 29, 2023 at 6:05 PM Shashank Sharma
<shashank.sharma@xxxxxxx> wrote:
This patch series introduces AMDGPU usermode queues for gfx
workloads.
Usermode queues is a method of GPU workload submission into the
graphics hardware without any interaction with kernel/DRM
schedulers.
In this method, a userspace graphics application can create its own
workqueue and submit it directly in the GPU HW.
The general idea of how this is supposed to work:
- The application creates the following GPU objetcs:
- A queue object to hold the workload packets.
- A read pointer object.
- A write pointer object.
- A doorbell page.
- The application picks a 32-bit offset in the doorbell page for
this queue.
- The application uses the usermode_queue_create IOCTL
introduced in
this patch, by passing the the GPU addresses of these
objects (read
ptr, write ptr, queue base address and 32-bit doorbell
offset from
the doorbell page)
- The kernel creates the queue and maps it in the HW.
- The application can start submitting the data in the queue as
soon as
the kernel IOCTL returns.
- After filling the workload data in the queue, the app must
write the
number of dwords added in the queue into the doorbell
offset, and the
GPU will start fetching the data.
libDRM changes for this series and a sample DRM test program can be
found in the MESA merge request here:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/287
This patch series depends on the doorbell-manager changes, which
are
being reviewed here:
https://patchwork.freedesktop.org/series/115802/
Alex Deucher (1):
drm/amdgpu: UAPI for user queue management
Arvind Yadav (2):
drm/amdgpu: add new parameters in v11_struct
drm/amdgpu: map wptr BO into GART
Shashank Sharma (6):
drm/amdgpu: add usermode queue base code
drm/amdgpu: add new IOCTL for usermode queue
drm/amdgpu: create GFX-gen11 MQD for userqueue
drm/amdgpu: create context space for usermode queue
drm/amdgpu: map usermode queue into MES
drm/amdgpu: generate doorbell index for userqueue
drivers/gpu/drm/amd/amdgpu/Makefile | 3 +
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 10 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 +
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 298
++++++++++++++++++ .../drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c |
++++++++++++++++++ 230 ++++++++++++++
.../gpu/drm/amd/include/amdgpu_userqueue.h | 66 ++++
drivers/gpu/drm/amd/include/v11_structs.h | 16 +-
include/uapi/drm/amdgpu_drm.h | 55 ++++
9 files changed, 677 insertions(+), 9 deletions(-) create mode
100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
create mode 100644
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c
create mode 100644
drivers/gpu/drm/amd/include/amdgpu_userqueue.h
--
2.40.0