Re: [PATCH v4 02/10] drm/sched: Convert drm scheduler to use a work queue rather than kthread

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Fri, 6 Oct 2023 08:59:15 +0100

On 05/10/2023 05:13, Luben Tuikov wrote:
On 2023-10-04 23:33, Matthew Brost wrote:
On Tue, Sep 26, 2023 at 11:32:10PM -0400, Luben Tuikov wrote:
Hi,

On 2023-09-19 01:01, Matthew Brost wrote:
In XE, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.

1. In XE the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in XE we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.

2. In XE submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.

A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.

v2:
   - (Rob Clark) Fix msm build
   - Pass in run work queue
v3:
   - (Boris) don't have loop in worker
v4:
   - (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
   - (Boris) default to ordered work queue

Signed-off-by: Matthew Brost <matthew.brost@xxxxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   2 +-
  drivers/gpu/drm/etnaviv/etnaviv_sched.c    |   2 +-
  drivers/gpu/drm/lima/lima_sched.c          |   2 +-
  drivers/gpu/drm/msm/msm_ringbuffer.c       |   2 +-
  drivers/gpu/drm/nouveau/nouveau_sched.c    |   2 +-
  drivers/gpu/drm/panfrost/panfrost_job.c    |   2 +-
  drivers/gpu/drm/scheduler/sched_main.c     | 118 ++++++++++-----------
  drivers/gpu/drm/v3d/v3d_sched.c            |  10 +-
  include/drm/gpu_scheduler.h                |  14 ++-
  9 files changed, 79 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e366f61c3aed..16f3cfe1574a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
  			break;
  		}
  
-		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
+		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
  				   ring->num_hw_submission, 0,
  				   timeout, adev->reset_domain->wq,
  				   ring->sched_score, ring->name,
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 345fec6cb1a4..618a804ddc34 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
  {
  	int ret;
  
-	ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
+	ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
  			     etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
  			     msecs_to_jiffies(500), NULL, NULL,
  			     dev_name(gpu->dev), gpu->dev);
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index ffd91a5ee299..8d858aed0e56 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name)
  
  	INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
  
-	return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
+	return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
  			      lima_job_hang_limit,
  			      msecs_to_jiffies(timeout), NULL,
  			      NULL, name, pipe->ldev->dev);
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 40c0bc35a44c..b8865e61b40f 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
  	 /* currently managing hangcheck ourselves: */
  	sched_timeout = MAX_SCHEDULE_TIMEOUT;
  
-	ret = drm_sched_init(&ring->sched, &msm_sched_ops,
+	ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
  			num_hw_submissions, 0, sched_timeout,
  			NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);

checkpatch.pl complains here about unmatched open parens.


Will fix and run checkpatch before posting next rev.

  	if (ret) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 88217185e0f3..d458c2227d4f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -429,7 +429,7 @@ int nouveau_sched_init(struct nouveau_drm *drm)
  	if (!drm->sched_wq)
  		return -ENOMEM;
  
-	return drm_sched_init(sched, &nouveau_sched_ops,
+	return drm_sched_init(sched, &nouveau_sched_ops, NULL,
  			      NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
  			      NULL, NULL, "nouveau_sched", drm->dev->dev);
  }
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 033f5e684707..326ca1ddf1d7 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -831,7 +831,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
  		js->queue[j].fence_context = dma_fence_context_alloc(1);
  
  		ret = drm_sched_init(&js->queue[j].sched,
-				     &panfrost_sched_ops,
+				     &panfrost_sched_ops, NULL,
  				     nentries, 0,
  				     msecs_to_jiffies(JOB_TIMEOUT_MS),
  				     pfdev->reset.wq,
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index e4fa62abca41..ee6281942e36 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -48,7 +48,6 @@
   * through the jobs entity pointer.
   */
  
-#include <linux/kthread.h>
  #include <linux/wait.h>
  #include <linux/sched.h>
  #include <linux/completion.h>
@@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
  	return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL;
  }
  
+/**
+ * drm_sched_submit_queue - scheduler queue submission

There is no verb in the description, and is not clear what
this function does unless one reads the code. Given that this
is DOC, this should be clearer here. Something like "queue
scheduler work to be executed" or something to that effect.


Will fix.
  
Coming back to this from reading the patch below, it was somewhat
unclear what "drm_sched_submit_queue()" does, since when reading
below, "submit" was being read by my mind as an adjective, as opposed
to a verb. Perhaps something like:

drm_sched_queue_submit(), or
drm_sched_queue_exec(), or
drm_sched_queue_push(), or something to that effect. You pick. :-)


I prefer the name as is. In this patch we have:

drm_sched_submit_queue()
drm_sched_submit_start)
drm_sched_submit_stop()
drm_sched_submit_ready()

I like all these functions start with 'drm_sched_submit' which allows
for easy searching for the functions that touch the DRM scheduler
submission state.

With a little better doc are you fine with the names as is.

Notice the following scheme in the naming,

drm_sched_submit_queue()
drm_sched_submit_start)
drm_sched_submit_stop()
drm_sched_submit_ready()
\---+---/ \--+-/ \-+-/
     |        |     +---> a verb
     |        +---------> should be a noun (something in the component)
     +------------------> the kernel/software component

And although "queue" can technically be used as a verb too, I'd rather it be "enqueue",
like this:

drm_sched_submit_enqueue()

And using "submit" as the noun of the component is a bit cringy,
since "submit" is really a verb, and it's cringy to make it a "state"
or an "object" we operate on in the DRM Scheduler. "Submission" is
a noun, but "submission enqueue/start/stop/ready" doesn't sound
very well thought out. "Submission" really is what the work-queue
does.

I'd rather it be a real object, like for instance,

drm_sched_wqueue_enqueue()
drm_sched_wqueue_start)
drm_sched_wqueue_stop()
drm_sched_wqueue_ready()

Which tells me that the component is the DRM Scheduler, the object is a/the work-queue,
and the last word as the verb, is the action we're performing on the object, i.e. the work-queue.
Plus, all these functions actually do operate on work-queues, directly or indirectly,
are new, so it's a win-win naming scheme.

I think that that would be most likeable.

FWIW I was suggesting not to encode the fact submit queue is implemented 
with a workqueue in the API name. IMO it would be nicer and less 
maintenance churn should something channge if the external components 
can be isolated from that detail.

drm_sched_submit_queue_$verb? If not viewed as too verbose...

Regards,

Tvrtko