On Tue, Mar 18, 2025 at 1:46 PM Rodrigo Siqueira <siqueira@xxxxxxxxxx> wrote:
>
> On 03/13, Alex Deucher wrote:
> > On Thu, Mar 13, 2025 at 6:21 PM Rodrigo Siqueira <siqueira@xxxxxxxxxx> wrote:
> > >
> > > On 03/13, Alex Deucher wrote:
> > > > To better evaluate user queues, add a module parameter
> > > > to disable kernel queues. With this set, kernel queues
> > > > are disabled and only user queues are available. This
> > > > frees up hardware resources for use in user queues which
> > > > would otherwise be used by kernel queues and provides
> > > > a way to validate user queues without the presence
> > > > of kernel queues.
> > >
> > > Hi Alex,
> > >
> > > I'm trying to understand how GFX and MES deal with different queues, and
> > > I used this patchset to guide me through that. In this sense, could you
> > > help me with the following points?
> > >
> > > FWIU, the GFX has what are called pipes, which in turn have hardware
> > > queues associated with them. For example, a GFX can have 2 pipes, and
> > > each pipe could have 2 hardware queues; or it could have 1 pipe and 8
> > > queues. Is this correct?
>
> Hi Alex, first of all, thanks a lot for your detailed explanation.
> I still have some other questions, see them inline.
>
> > Right. For gfx, compute, and SDMA you have pipes (called instances on
> > SDMA) and queues. A pipe can only execute one queue at a time. The
>
> What is the difference between GFX and Compute? Tbh, I thought they were
> the same component.

They both share access to the shader cores, but they have different
front ends. GFX has a bunch of fixed-function blocks used by draws,
while compute dispatches directly to the shaders. There are separate
pipes for each. You can send dispatch packets to GFX, but you can't
send draw packets to compute.

>
> I was also thinking about the concept of a pipe, and I'm trying to
> define what a pipe is in this context (the word pipe is one of those
> words with many meanings in computers). Is the below definition accurate
> enough?
>
> Pipe, in the context of GFX, Compute, and SDMA, is a mechanism for
> running threads.

Yes. It's the hardware that actually processes the packets in a queue.
You have multiple HQDs associated with a pipe; only one will be
processed by a pipe at a time.

>
> > pipe will switch between all of the mapped queues. You have storage
>
> Above, you said that each pipe will switch between queues, and a little
> bit below, in your explanation about MES, you said:
>
>   [..] If there are more MQDs than HQDs, the MES firmware will preempt
>   other user queues to make sure each queue gets a time slice.
>
> Does it mean that the GFX pipe has the mechanics of switching queues
> while MES has the scheduling logic?

The pipes have hardware logic to switch between the HQD slots. MES is a
separate microcontroller which handles the mapping and unmapping of
MQDs into HQDs. It handles priorities and oversubscription (more MQDs
than HQDs).

>
> Does the below example and explanation make sense?
>
> Suppose the following scenario:
> - One pipe (pipe0) and two queues (queue[0] and queue[1]).
> - 3 MQDs (mqd[0], mqd[1], and mqd[2]).
> - pipe0 is running a user queue in queue[1].
> - pipe0 is running a kernel queue in queue[0].

Yes. A pipe can only execute one queue at a time; it will dynamically
switch between the active HQDs.
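If it helps to picture that relationship, here is a rough, purely
illustrative C model. The names and fields below are invented for this
sketch (they are not the actual amdgpu structures); it only captures the
"many MQDs in memory, a few register-backed HQD slots per pipe" shape
described above:

#include <linux/types.h>

/* Register-backed slots the pipe switches between (2 is just an example). */
#define TOY_HQDS_PER_PIPE 2

/* Memory Queue Descriptor: lives in memory, one per created queue. */
struct toy_mqd {
        u64 ring_gpu_addr;      /* ring buffer the queue executes from */
        u64 doorbell_offset;    /* how the queue gets kicked */
        u64 save_area_addr;     /* where preempted state gets spilled */
};

/* Hardware Queue Descriptor slot: register-backed, only a few per pipe. */
struct toy_hqd_slot {
        struct toy_mqd *mapped; /* MQD currently loaded into the slot, or NULL */
};

struct toy_pipe {
        /*
         * The pipe only ever executes queues mapped into these slots;
         * everything else exists solely as an MQD in memory until the
         * MES (or the driver, for legacy/kernel queues) maps it in.
         */
        struct toy_hqd_slot hqd[TOY_HQDS_PER_PIPE];
};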
>
> Fwiu, a pipe can change the current queue in execution, but it does not
> do it by itself. In other words, it has no scheduling logic; it only has
> the mechanics of switching queues inside it. When the pipe switches
> between queues, it uses Mid Command Buffer Preemption (MCBP), which
> saves some very basic information but no register state; in other
> words, those registers must be stored in memory (MES handles it?).

More or less. A pipe will switch between queues on a command-stream or
packet-by-packet basis, depending on the engine. You can preempt a
queue if you want. In general the driver will ask MES to do this if it
needs to preempt a queue. The MES will also do this internally for
scheduling reasons. MES firmware handles the saving of state to the
MQD.

>
> In turn, MES has access to all MQDs handed over to it, which means that
> MES has all the queue states available for the scheduling and
> communication with the GFX pipe. Suppose that the GFX pipe is running
> mqd[2] in queue[1], and now MES wants to replace it with mqd[0]. The
> communication will be something like the following:
>
> 1. MES to GFX pipe0: Replace(mqd[2], in pipe0, queue[1]) with mqd[0].
> 2. GFX pipe0: Just stop the current pipe, and start mqd[0].
>
> Does it look correct to you?

MES would talk to the hardware to unmap queue[1] and save its state to
mqd[2]. It would then talk to the hardware to map the state from mqd[0]
into queue[1].

>
> > in memory (called an MQD -- Memory Queue Descriptor) which defines the
> > state of the queue (GPU virtual addresses of the queue itself, save
> > areas, doorbell, etc.). The queues that the pipe switches between are
> > defined by HQDs (Hardware Queue Descriptors). These are basically
> > register-based memory for the queues that the pipe can switch between.
>
> I was thinking about this register-based memory part. Does it mean that
> switching between them is just a matter of updating one of those LOW and
> HIGH registers?

Not exactly, but close. The HQD registers are saved in/out of the MQD,
and the MQD also has pointers to other buffers which store other things
like pipeline state, etc. Firmware basically tells the hw to preempt or
unmap the queues, waits for that to complete (waits for the HQD_ACTIVE
bit for the queue to go low), then saves the state to the MQD. For
resuming or mapping a queue, the opposite happens: firmware copies the
state out of the MQD into the HQD registers and loads any additional
state. Setting the HQD_ACTIVE bit for the queue is what ultimately
enables it.
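If it helps, here is that unmap/map sequence as a heavily simplified,
self-contained C sketch. The structure and field names are invented for
illustration (real MQD/HQD layouts are much larger), and in reality this
work is done by the MES/engine firmware, not by driver code:

#include <linux/types.h>

struct sketch_mqd {             /* memory-backed copy of the queue state */
        u64 ring_base;
        u64 rptr;
        u64 wptr;
};

struct sketch_hqd {             /* register-backed slot inside a pipe */
        u64 ring_base;
        u64 rptr;
        u64 wptr;
        u32 active;             /* stands in for the HQD_ACTIVE bit */
};

/* Unmap/preempt: deactivate the slot, then spill its state to the MQD. */
static void sketch_unmap_hqd(struct sketch_hqd *hqd, struct sketch_mqd *mqd)
{
        hqd->active = 0;        /* real fw waits for HQD_ACTIVE to read back 0 */
        mqd->ring_base = hqd->ring_base;
        mqd->rptr = hqd->rptr;
        mqd->wptr = hqd->wptr;
}

/* Map/resume: restore the registers from the MQD, then set HQD_ACTIVE. */
static void sketch_map_hqd(struct sketch_hqd *hqd, const struct sketch_mqd *mqd)
{
        hqd->ring_base = mqd->ring_base;
        hqd->rptr = mqd->rptr;
        hqd->wptr = mqd->wptr;
        hqd->active = 1;        /* the pipe can now schedule this queue again */
}

The driver itself never does the register copy above; it only asks the
MES to map or unmap a queue and hands it the MQD.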
>
> > The driver sets up an MQD for each queue that it creates. The MQDs
> > are then handed to the MES firmware for mapping. The MES firmware can
> > map a queue as a legacy queue (i.e. a kernel queue) or a user queue.
> > The difference is that a legacy queue is statically mapped to an HQD
> > and is never preempted. User queues are dynamically mapped to the
> > HQDs by the MES firmware. If there are more MQDs than HQDs, the MES
> > firmware will preempt other user queues to make sure each queue gets a
> > time slice.
> >
> > >
> > > (for this next part, suppose 1 pipe, 2 hardware queues)
> > > By default, one of the hardware queues is reserved for the Kernel Queue,
> > > and user space could use the other. GFX has the MES block "connected"
> > > to all pipe queues, and MES is responsible for scheduling different ring
> > > buffers (in memory) in the pipe's hardware queue (effectively making the
> > > ring active). However, since the kernel queue is always present, MES
> > > only performs scheduling on one of the hardware queues. This scheduling
> > > occurs with the MES mapping and unmapping available rings in memory to
> > > the hardware queue.
> > >
> > > Does the above description sound correct to you? How about the below
> > > diagram? Does it look correct to you?
> >
> > More or less. The MES handles all of the queues (kernel or user).
> > The only real difference is that kernel queues are statically mapped
> > to an HQD while user queues are dynamically scheduled in the available
> > HQDs based on the level of over-subscription. E.g., if you have hardware
> > with 1 pipe and 2 HQDs, you could have a kernel queue on 1 HQD and the
> > MES would schedule all of the user queues on the remaining 1 HQD. If
> > you don't enable any kernel queues, then you have 2 HQDs that the MES
> > can use for scheduling user queues.
> >
> > >
> > > (I hope the diagram looks fine in your email client; if not, I can
> > > attach a picture of it.)
> > >
> > > +----------------------------------------------------------------------+
> > > | GFX                                                                  |
> > > |                                                                      |
> > > |         +--------------------+      +------------------------------+  |
> > > |         | (Hardware Queue 0) |----->| Kernel Queue (no eviction,   |  |
> > > |         |                    |      | no MES scheduling)           |  |
> > > | PIPE 0  +--------------------+      +------------------------------+  |
> > > |         | (Hardware Queue 1) |----->| User Queue (MES schedules)   |  |
> > > |         +--------------------+      +------------------------------+  |
> > > |                                                    |                  |
> > > |                                               Un/Map Ring            |
> > > +----------------------------------------------------|------------------+
> > >                                                      |
> > >                                                      v
> > > +----------------------------------------------------------------------+
> > > | MEMORY                                                               |
> > > |                                                                      |
> > > |   +--------+   +--------+        +--------+                          |
> > > |   | Ring 0 |   | Ring 1 |  ...   | Ring N |                          |
> > > |   +--------+   +--------+        +--------+                          |
> > > +----------------------------------------------------------------------+
> > >
> > > Is the idea in this series to experiment with making the kernel queue
> > > not fully occupy one of the hardware queues? By making the kernel queue
> > > able to be scheduled, this would provide one extra queue to be used for
> > > other things. Is this correct?
> >
> > Right. This series paves the way for getting rid of kernel queues
> > altogether. Having no kernel queues leaves all of the resources
> > available to user queues.
>
> Another question: I guess kernel queues use VMID 0, and all of the other
> user queues will use a different VMID, right? Does the VMID matter for
> this transition to make the kernel queue legacy?

vmid 0 is the GPU virtual address space used for all kernel driver
operations. For kernel queues, the queue itself operates in the vmid 0
address space, but each command buffer (Indirect Buffer -- IB) operates
in a driver-assigned non-0 vmid address space. For kernel queues, the
driver manages the vmids. For user queues, the queue and IBs both
operate in the user's non-0 vmid address space. The MES manages the
vmid assignments for user queues. The driver provides a pointer to the
user's GPU VM page tables and MES assigns a vmid when it maps the
queue. The driver provides a mask of which vmids the MES can use so
that there are no conflicts when mixing kernel and user queues.
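To make the disable_kq side of this concrete, here is a hedged sketch of
what such a vmid split could look like. EXAMPLE_NUM_VMIDS and the 1..7
reservation are made-up numbers for illustration only; the real
per-ASIC policy lives in the driver/MES setup code touched by this
series:

#include <linux/bits.h>
#include <linux/types.h>

#define EXAMPLE_NUM_VMIDS 16    /* assumption for this sketch */

/*
 * Illustrative only: choose which non-0 vmids are handed to the MES
 * for user queues. vmid 0 always stays with the kernel driver. With
 * kernel queues enabled, hold some vmids back for the driver's own IB
 * submissions; with disable_kq, hand them all to the MES.
 */
static u16 example_mes_vmid_mask(bool disable_kq)
{
        u16 mask = GENMASK(EXAMPLE_NUM_VMIDS - 1, 1);   /* vmids 1..15 */

        if (!disable_kq)
                mask &= ~GENMASK(7, 1); /* keep vmids 1..7 for kernel queues */

        return mask;
}

With disable_kq=1 nothing is held back, which is the "make more vmids
available" part of the series.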
Alex

>
> Thanks
>
> > >
> > > I'm unsure if I fully understand this series's idea; please correct me
> > > if I'm wrong.
> > >
> > > Also, please elaborate more on the type of tasks that the kernel queue
> > > handles. Tbh, I did not fully understand the idea behind it.
> >
> > In the future of user queues, kernel queues would not be created or
> > used at all. Today, on most existing hardware, kernel queues are all
> > that is available. Today, when an application submits work to the
> > kernel driver, the kernel driver submits all of the application
> > command buffers to kernel queues. E.g., in most cases there is a
> > single kernel GFX queue and all applications which want to use the GFX
> > engine funnel into that queue. The CS IOCTL basically takes the
> > command buffers from the applications and schedules them on the kernel
> > queue. With user queues, each application will create its own user
> > queues and will submit work directly to its user queues. No need for
> > an IOCTL for each submission, no need to share a single kernel queue,
> > etc.
> >
> > Alex
> >
> > > Thanks
> > >
> > > > v2: use num_gfx_rings and num_compute_rings per
> > > >     Felix suggestion
> > > > v3: include num_gfx_rings fix in amdgpu_gfx.c
> > > > v4: additional fixes
> > > > v5: MEC EOP interrupt handling fix (Sunil)
> > > >
> > > > Alex Deucher (11):
> > > >   drm/amdgpu: add parameter to disable kernel queues
> > > >   drm/amdgpu: add ring flag for no user submissions
> > > >   drm/amdgpu/gfx: add generic handling for disable_kq
> > > >   drm/amdgpu/mes: centralize gfx_hqd mask management
> > > >   drm/amdgpu/mes: update hqd masks when disable_kq is set
> > > >   drm/amdgpu/mes: make more vmids available when disable_kq=1
> > > >   drm/amdgpu/gfx11: add support for disable_kq
> > > >   drm/amdgpu/gfx12: add support for disable_kq
> > > >   drm/amdgpu/sdma: add flag for tracking disable_kq
> > > >   drm/amdgpu/sdma6: add support for disable_kq
> > > >   drm/amdgpu/sdma7: add support for disable_kq
> > > >
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu.h      |   1 +
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |   4 +
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |   9 ++
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  |   8 +-
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |   2 +
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c  |  30 ++--
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c  |  26 ++-
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   2 +-
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h |   1 +
> > > >  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c   | 191 ++++++++++++++++-------
> > > >  drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c   | 183 +++++++++++++++-------
> > > >  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c   |   2 +-
> > > >  drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c   |   2 +-
> > > >  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c   |  16 +-
> > > >  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c   |  15 +-
> > > >  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c   |   4 +
> > > >  drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c   |   4 +
> > > >  17 files changed, 345 insertions(+), 155 deletions(-)
> > > >
> > > > --
> > > > 2.48.1
> > > >
> > >
> > > --
> > > Rodrigo Siqueira
>
> --
> Rodrigo Siqueira