Re: [PATCH RFC 10/18] drm/scheduler: Add can_run_job callback

Christian König <christian.koenig@xxxxxxx> · Wed, 5 Apr 2023 16:14:11 +0200

Am 05.04.23 um 15:40 schrieb Daniel Vetter:
On Tue, Mar 07, 2023 at 11:25:35PM +0900, Asahi Lina wrote:
Some hardware may require more complex resource utilization accounting
than the simple job count supported by drm_sched internally. Add a
can_run_job callback to allow drivers to implement more logic before
deciding whether to run a GPU job.

Signed-off-by: Asahi Lina <lina@xxxxxxxxxxxxx>
Ok scheduler rules, or trying to summarize the entire discussion:

dma_fence rules are very tricky. The two main chapters in the docs are

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_buf#dma-fence-cross-driver-contract
https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_buf#indefinite-dma-fences

Unforutunately I don't think it's possible to check this at compile time,
thus far all we can do is validate at runtime. I've posted two patches for
this:

https://lore.kernel.org/dri-devel/20201023122216.2373294-17-daniel.vetter@xxxxxxxx/
https://lore.kernel.org/dri-devel/20201023122216.2373294-20-daniel.vetter@xxxxxxxx/

Unfortunately most drivers are buggy and get this completely wrong, so
realistically we'd need to make this a per-driver opt-out and annotate all
current drivers. Well except amdgpu is correct by now I think (they'd
still need to test that).

There is still one potential memory allocation in the run_job callback 
in amdgpu which I wasn't able to fix yet.

But that one is purely academic and could potentially be trivially 
replaced with using GFP_ATOMIC if we ever have to.

Christian.

  And Rob Clark is working on patches to fix up
msm.

I think best here is if you work together with Rob to make sure these
annotations are mandatory for any rust drivers (I don't want new buggy
drivers at least). Would also be great to improve the kerneldoc for all
the driver hooks to explain these restrictions and link to the relevant
kerneldocs (there's also one for the dma_fence signalling annotations
which might be worth linking too).

I don't see any way to make this explicit in rust types, it's really only
something runtime tests (using lockdep) can catch. Somewhat disappointing.

For the other things discussed here:

- Option<Dma_Fence> as the return value for ->prepare_job makes sense to
   me.

- I don't see any way a driver can use ->can_run_job without breaking the
   above rules, that really doesn't sound like a good idea to me.

Cheers, Daniel

---
  drivers/gpu/drm/scheduler/sched_main.c | 10 ++++++++++
  include/drm/gpu_scheduler.h            |  8 ++++++++
  2 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 4e6ad6e122bc..5c0add2c7546 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1001,6 +1001,16 @@ static int drm_sched_main(void *param)
  		if (!entity)
  			continue;
  
+		if (sched->ops->can_run_job) {
+			sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
+			if (!sched_job) {
+				complete_all(&entity->entity_idle);
+				continue;
+			}
+			if (!sched->ops->can_run_job(sched_job))
+				continue;
+		}
+
  		sched_job = drm_sched_entity_pop_job(entity);
  
  		if (!sched_job) {
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 9db9e5e504ee..bd89ea9507b9 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -396,6 +396,14 @@ struct drm_sched_backend_ops {
  	struct dma_fence *(*prepare_job)(struct drm_sched_job *sched_job,
  					 struct drm_sched_entity *s_entity);
  
+	/**
+	 * @can_run_job: Called before job execution to check whether the
+	 * hardware is free enough to run the job.  This can be used to
+	 * implement more complex hardware resource policies than the
+	 * hw_submission limit.
+	 */
+	bool (*can_run_job)(struct drm_sched_job *sched_job);
+
  	/**
           * @run_job: Called to execute the job once all of the dependencies
           * have been resolved.  This may be called multiple times, if

--
2.35.1