Am 04.04.23 um 11:43 schrieb Tvrtko Ursulin:
On 04/04/2023 01:22, Matthew Brost wrote:
Hello,
As a prerequisite to merging the new Intel Xe DRM driver [1] [2], we
have been asked to merge our common DRM scheduler patches first as well
as develop a common solution for long running workloads with the DRM
scheduler. This RFC series is our first attempt at doing this. We
welcome any and all feedback.
This can be thought of as 4 parts, detailed below.
- DRM scheduler changes for 1 to 1 relationship between scheduler and
entity (patches 1-3)
In Xe all of the scheduling of jobs is done by a firmware scheduler
(the GuC), which is a new paradigm WRT the DRM scheduler and presents
several problems, as the DRM scheduler was originally designed to
schedule jobs on hardware queues. The main problem is that the DRM
scheduler expects the submission order of jobs to be the completion
order of jobs, even across multiple entities. This assumption falls
apart with a firmware scheduler, as a firmware scheduler has no concept
of jobs and jobs can complete out of order. A novel solution was
originally proposed by Faith during the initial prototype of Xe: create
a 1 to 1 relationship between scheduler and entity. I believe the AGX
driver [3] is using this approach and Boris may use this approach as
well for the Mali driver [4].
To support a 1 to 1 relationship we move the main execution function
from a kthread to a work queue and add a new scheduling mode which
bypasses code in the DRM scheduler that isn't needed in a 1 to 1
relationship. The new scheduling mode should unify all drivers' usage
with a 1 to 1 relationship and can be thought of as using the scheduler
as a dependency / in-flight job tracker rather than a true scheduler.
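For illustration only, a rough sketch of what a driver-side 1:1 setup
could look like. The drm_sched_init() argument list below (a run_wq
plus a sched_policy) is an assumption based on this cover letter rather
than the final API, and the my_* names / MY_RING_JOB_LIMIT are
placeholders:

#include <drm/gpu_scheduler.h>

/* Hypothetical per-queue container: one scheduler per entity. */
struct my_queue {
	struct drm_gpu_scheduler sched;
	struct drm_sched_entity entity;
};

static int my_queue_init(struct my_queue *q, struct workqueue_struct *wq,
			 struct device *dev)
{
	struct drm_gpu_scheduler *sched_list[] = { &q->sched };
	int err;

	/* Assumed signature: the main execution function runs on 'wq'
	 * and the policy selects the new 1:1 mode, which skips the
	 * rq/priority handling entirely. */
	err = drm_sched_init(&q->sched, &my_sched_ops, wq,
			     MY_RING_JOB_LIMIT, 0, MAX_SCHEDULE_TIMEOUT,
			     NULL, NULL, DRM_SCHED_POLICY_SINGLE_ENTITY,
			     "my-queue", dev);
	if (err)
		return err;

	/* The single entity bound to that scheduler. */
	return drm_sched_entity_init(&q->entity, DRM_SCHED_PRIORITY_NORMAL,
				     sched_list, 1, NULL);
}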
Once you add capability for a more proper 1:1 via
DRM_SCHED_POLICY_SINGLE_ENTITY, do you still have further need to
replace kthreads with a wq?
Yeah, I fail to see the need for that as well. On the other hand it
would be really nice to get rid of the rq/priority design in general.
Or in other words, what purpose does offloading the job-picking code
to a separate execution context serve? Could it be done directly in
the 1:1 mode and leave the kthread setup for N:M?
Well, moving from a kthread to a work item is beneficial on its own,
since the latter usually follows the source of its queue. E.g. when
this is triggered by an interrupt we run on the CPU of the interrupt
and avoid inter-CPU signaling.
Apart from those design level questions, a low-level open issue IMO is
still that the default fallback of using the system_wq has the
potential to affect latency for other drivers. But that's for those
driver owners to approve.
Oh, yeah that's a good point as well. This needs some high priority queue.
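A minimal sketch of that, assuming the driver simply allocates and
owns a dedicated high-priority workqueue and hands it to the scheduler
instead of relying on the system_wq fallback (the names are
placeholders):

#include <linux/workqueue.h>

static struct workqueue_struct *my_sched_wq;

static int my_sched_wq_init(void)
{
	/* Dedicated, high priority and unbound, so scheduler work does
	 * not compete with (or delay) other users of the system_wq. */
	my_sched_wq = alloc_workqueue("my-drm-sched",
				      WQ_HIGHPRI | WQ_UNBOUND, 0);
	return my_sched_wq ? 0 : -ENOMEM;
}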
Christian.
Regards,
Tvrtko
- Generic messaging interface for DRM scheduler
The idea is to be able to communicate with the submission backend via
in-band (relative to the main execution function) messages. Messages
are backend defined and flexible enough for any use case. In Xe we use
these messages to clean up entities, set properties for entities, and
suspend / resume execution of an entity [5]. I suspect other drivers
can leverage this messaging concept too, as it is a convenient way to
avoid races in the backend.
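As a rough illustration only, a backend could process such messages
from the scheduler's main execution context; the drm_sched_msg layout,
the my_* helpers and the opcodes below are assumptions based on this
description, not the final interface:

enum my_sched_msg_opcode {
	MY_MSG_CLEANUP,
	MY_MSG_SET_PRIORITY,
	MY_MSG_SUSPEND,
	MY_MSG_RESUME,
};

/*
 * Runs from the scheduler's main execution function, so it is
 * naturally serialized against job submission for this entity and
 * cannot race with run_job() for the same queue.
 */
static void my_process_msg(struct drm_sched_msg *msg)
{
	struct my_queue *q = msg->private_data;

	switch (msg->opcode) {
	case MY_MSG_SUSPEND:
		my_fw_suspend_queue(q);
		break;
	case MY_MSG_RESUME:
		my_fw_resume_queue(q);
		break;
	case MY_MSG_CLEANUP:
		my_fw_release_queue(q);
		break;
	default:
		break;
	}
}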
- Support for using TDR for all error paths of a scheduler / entity
Fix a few races / bugs and add a function to dynamically set the TDR
timeout.
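Assumed usage of the timeout helper, with the name taken from the
patch title ("drm/sched: Add helper to set TDR timeout"); the exact
signature is a guess:

	/* E.g. lengthen the TDR timeout for a queue that is known to
	 * need longer preemption / completion times. */
	drm_sched_set_timeout(&q->sched, msecs_to_jiffies(timeout_ms));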
- Annotate dma-fences for long running workloads.
The idea here is to use dma-fences only as sync points within the
scheduler and never export them for long running workloads. By
annotating these fences as long running we ensure that these dma-fences
are never used in a way that breaks the dma-fence rules. A benefit of
this approach is that the scheduler can still safely flow control the
execution ring buffer via the job limit without breaking the dma-fence
rules.
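Purely as a sketch of the enforcement idea: dma_fence_is_lr() below
stands in for whatever predicate the series ends up exposing for
long-running fences, and is an assumption here, not the final API:

#include <linux/dma-fence.h>

static int my_export_sync_point(struct dma_fence *fence)
{
	/* Long-running fences must stay internal to the scheduler and
	 * never end up in a syncobj, dma-resv or anywhere else that
	 * expects the usual dma-fence completion guarantees. */
	if (WARN_ON_ONCE(dma_fence_is_lr(fence)))
		return -EINVAL;

	/* ... install the fence as a sync point as usual ... */
	return 0;
}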
Again, this is a first draft and we are looking forward to feedback.
Enjoy - Matt
[1] https://gitlab.freedesktop.org/drm/xe/kernel
[2] https://patchwork.freedesktop.org/series/112188/
[3] https://patchwork.freedesktop.org/series/114772/
[4] https://patchwork.freedesktop.org/patch/515854/?series=112188&rev=1
[5]
https://gitlab.freedesktop.org/drm/xe/kernel/-/blob/drm-xe-next/drivers/gpu/drm/xe/xe_guc_submit.c#L1031
Matthew Brost (8):
drm/sched: Convert drm scheduler to use a work queue rather than
kthread
drm/sched: Move schedule policy to scheduler / entity
drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
drm/sched: Add generic scheduler message interface
drm/sched: Start run wq before TDR in drm_sched_start
drm/sched: Submit job before starting TDR
drm/sched: Add helper to set TDR timeout
drm/syncobj: Warn on long running dma-fences
Thomas Hellström (2):
dma-buf/dma-fence: Introduce long-running completion fences
drm/sched: Support long-running sched entities
drivers/dma-buf/dma-fence.c | 142 +++++++---
drivers/dma-buf/dma-resv.c | 5 +
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 14 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +-
drivers/gpu/drm/drm_syncobj.c | 5 +-
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 5 +-
drivers/gpu/drm/lima/lima_sched.c | 5 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 5 +-
drivers/gpu/drm/panfrost/panfrost_job.c | 5 +-
drivers/gpu/drm/scheduler/sched_entity.c | 127 +++++++--
drivers/gpu/drm/scheduler/sched_fence.c | 6 +-
drivers/gpu/drm/scheduler/sched_main.c | 278 +++++++++++++++-----
drivers/gpu/drm/v3d/v3d_sched.c | 25 +-
include/drm/gpu_scheduler.h | 130 +++++++--
include/linux/dma-fence.h | 60 ++++-
16 files changed, 649 insertions(+), 184 deletions(-)