Am 24.01.24 um 22:08 schrieb Matthew Brost:
All entities must be drained in the DRM scheduler run job worker to
avoid the following case. An entity found that is ready, no job found
ready on entity, and run job worker goes idle with other entities + jobs
ready. Draining all ready entities (i.e. loop over all ready entities)
in the run job worker ensures all job that are ready will be scheduled.
That doesn't make sense. drm_sched_select_entity() only returns entities
which are "ready", e.g. have a job to run.
If that's not the case any more then you have broken something else.
Regards,
Christian.
Cc: Thorsten Leemhuis <regressions@xxxxxxxxxxxxx>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx>
Closes: https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuOWtoeeE+q26zE+Q@xxxxxxxxxxxxxx/
Reported-and-tested-by: Mario Limonciello <mario.limonciello@xxxxxxx>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3124
Link: https://lore.kernel.org/all/20240123021155.2775-1-mario.limonciello@xxxxxxx/
Reported-by: Vlastimil Babka <vbabka@xxxxxxx>
Closes: https://lore.kernel.org/dri-devel/05ddb2da-b182-4791-8ef7-82179fd159a8@xxxxxxx/T/#m0c31d4d1b9ae9995bb880974c4f1dbaddc33a48a
Signed-off-by: Matthew Brost <matthew.brost@xxxxxxxxx>
---
drivers/gpu/drm/scheduler/sched_main.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 550492a7a031..85f082396d42 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1178,21 +1178,20 @@ static void drm_sched_run_job_work(struct work_struct *w)
struct drm_sched_entity *entity;
struct dma_fence *fence;
struct drm_sched_fence *s_fence;
- struct drm_sched_job *sched_job;
+ struct drm_sched_job *sched_job = NULL;
int r;
if (READ_ONCE(sched->pause_submit))
return;
- entity = drm_sched_select_entity(sched);
+ /* Find entity with a ready job */
+ while (!sched_job && (entity = drm_sched_select_entity(sched))) {
+ sched_job = drm_sched_entity_pop_job(entity);
+ if (!sched_job)
+ complete_all(&entity->entity_idle);
+ }
if (!entity)
- return;
-
- sched_job = drm_sched_entity_pop_job(entity);
- if (!sched_job) {
- complete_all(&entity->entity_idle);
return; /* No more work */
- }
s_fence = sched_job->s_fence;