Re: [RFC PATCH 00/10] drm/panthor: Add user submission

On 29/08/2024 11:40, Christian König wrote:
On 28.08.24 at 19:25, Mihail Atanassov wrote:
Hello all,

This series implements a mechanism to expose Mali CSF GPUs' queue
ringbuffers directly to userspace, along with paraphernalia to allow
userspace to control job synchronisation between the CPU and GPU.

The goal of these changes is to allow userspace to control work
submission to the FW/HW directly without kernel intervention in the
common case, thereby reducing context switching overhead. It also allows
for greater flexibility in the way work is enqueued in the ringbufs.
For example, the current kernel submit path only supports indirect
calls, which is inefficient for small command buffers. Userspace can
also skip unnecessary sync operations.

Question is how do you guarantee forward progress for fence signaling?

E.g. when are fences created and published? How do they signal?

The current XGS queue is built upon an instance of the DRM scheduler, and XGS jobs which cannot complete immediately are assigned a fence (as one would expect). This proposal relies on the DRM scheduler timeout to ensure forward progress if user space has encoded a "bad" stream of commands.

PS: We have tried to consider error propagation in case of timeouts, but the implementation in this area is most likely somewhat lacking at the moment (not tested).

How are dependencies handled? How can the kernel suspend an userspace queue?

Mali FW will send IDLE interrupts when a group has become idle. This is already (mostly) handled in Panthor today. There is, of course, a race to handle here, between the GPU going idle and user space submitting new work.

I'm actually working on this part right now. As this patchset stands, the RTPM suspend callback doesn't check whether user space has managed to submit more work in the timeframe between "IDLE" and the RTPM suspend callback. I just need to correctly "unbind" the group/queues, unmap the I/O pages used, and abort the RTPM suspend if I detect that user space has managed to submit more work.

How does memory management work in this case?

I'm not sure exactly what you are referring to here. There has been essentially no change to how we handle memory.

If you think of how GPU jobs and VM_BIND interact, the change here is that it is now XGS jobs and VM_BIND which interact. XGS takes on the same duties as GPU jobs have under kernel submission (in this regard).

In fact, if you look at the submission flow for GPU jobs and XGS jobs, you will find that they are virtually identical when it comes to setting up fences and dependencies. The same goes for VM_BIND jobs.

--
Regards,
Ketil

Regards,
Christian.


This is still a work-in-progress, there's an outstanding issue with
multiple processes using different submission flows triggering
scheduling bugs (e.g. the same group getting scheduled twice), but we'd
love to gather some feedback on the suitability of the approach in
general and see if there's a clear path to merging something like this
eventually.

I've also CCd AMD maintainers because they have in the past done
something similar[1], in case they want to chime in.

There are two uses of this new uAPI in Mesa, one in gallium/panfrost
(link TBD), and one in panvk [2].

The Gallium implementation is a naïve change just to switch the
submission model and exercise the new kernel code, and we don't plan
on pursuing this at this time.

The panvk driver changes are, however, a better representation of the
intent behind this new uAPI, so please consider that as the reference
userspace. It is still very much also a work in progress.

  * patch 1 adds all the uAPI changes;
  * patch 2 implements the GROUP_CREATE ioctl changes necessary to expose
    the required objects to userspace;
  * patch 3 maps the doorbell pages, similarly to how the user I/O page is
    mapped;
  * patch 4 implements GROUP_KICK, which lets userspace request an
    inactive group to be scheduled on the GPU;
  * patches 5 & 6 implement XGS queues, a way for userspace to
    synchronise GPU queue progress with DRM syncobjs;
  * patches 7 & 8 add notification mechanisms for user & kernel to signal
    changes to native GPU syncobjs.

[1] https://lore.kernel.org/amd-gfx/CADnq5_N61q_o+5WYUZsZ=qu7VmeXTFHQSxLwTco05gLzHaiswA@xxxxxxxxxxxxxx/t/#m116a36a598d8fad1329e053974ad37a4dc0f28ed
[2] https://gitlab.freedesktop.org/larsivsi/mesa/-/commits/panvk-v10-usersubmit?ref_type=heads

Ketil Johnsen (7):
   drm/panthor: Add uAPI to submit from user space
   drm/panthor: Extend GROUP_CREATE for user submission
   drm/panthor: Map doorbell pages
   drm/panthor: Add GROUP_KICK ioctl
   drm/panthor: Factor out syncobj handling
   drm/panthor: Implement XGS queues
   drm/panthor: Add SYNC_UPDATE ioctl

Mihail Atanassov (1):
   drm/panthor: Add sync_update eventfd handling

  drivers/gpu/drm/panthor/Makefile          |   4 +-
  drivers/gpu/drm/panthor/panthor_device.c  |  66 ++-
  drivers/gpu/drm/panthor/panthor_device.h  |  35 +-
  drivers/gpu/drm/panthor/panthor_drv.c     | 233 +++++++-
  drivers/gpu/drm/panthor/panthor_fw.c      |   2 +-
  drivers/gpu/drm/panthor/panthor_sched.c   | 408 +++++++++-----
  drivers/gpu/drm/panthor/panthor_sched.h   |   8 +-
  drivers/gpu/drm/panthor/panthor_syncobj.c | 167 ++++++
  drivers/gpu/drm/panthor/panthor_syncobj.h |  27 +
  drivers/gpu/drm/panthor/panthor_xgs.c     | 638 ++++++++++++++++++++++
  drivers/gpu/drm/panthor/panthor_xgs.h     |  42 ++
  include/uapi/drm/panthor_drm.h            | 243 +++++++-
  12 files changed, 1696 insertions(+), 177 deletions(-)
  create mode 100644 drivers/gpu/drm/panthor/panthor_syncobj.c
  create mode 100644 drivers/gpu/drm/panthor/panthor_syncobj.h
  create mode 100644 drivers/gpu/drm/panthor/panthor_xgs.c
  create mode 100644 drivers/gpu/drm/panthor/panthor_xgs.h





