On Tue, 2025-02-11 at 09:22 +0100, Christian König wrote: > Am 06.02.25 um 17:40 schrieb Tvrtko Ursulin: > > Replace a copy of DRM scheduler's to_drm_sched_job with a copy of a > > newly > > added __drm_sched_entity_queue_pop. > > > > This allows breaking the hidden dependency that queue_node has to > > be the > > first element in struct drm_sched_job. > > > > A comment is also added with a reference to the mailing list > > discussion > > explaining the copied helper will be removed when the whole broken > > amdgpu_job_stop_all_jobs_on_sched is removed. > > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx> > > Cc: Christian König <christian.koenig@xxxxxxx> > > Cc: Danilo Krummrich <dakr@xxxxxxxxxx> > > Cc: Matthew Brost <matthew.brost@xxxxxxxxx> > > Cc: Philipp Stanner <phasta@xxxxxxxxxx> > > Cc: "Zhang, Hawking" <Hawking.Zhang@xxxxxxx> > > Reviewed-by: Christian König <christian.koenig@xxxxxxx> I think this v3 has been supplanted by a v4 by now. @Tvrtko: btw, do you create patches with git format-patch -v4 ? That way the v4 label will be included in all patch titles, too, not just the cover letter. That makes searching etc. easier in large inboxes P. > > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 22 +++++++++++++++++++- > > -- > > 1 file changed, 19 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > index 100f04475943..22cb48bab24d 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > @@ -411,8 +411,24 @@ static struct dma_fence *amdgpu_job_run(struct > > drm_sched_job *sched_job) > > return fence; > > } > > > > -#define to_drm_sched_job(sched_job) \ > > - container_of((sched_job), struct drm_sched_job, > > queue_node) > > +/* > > + * This is a duplicate function from DRM scheduler > > sched_internal.h. > > + * Plan is to remove it when amdgpu_job_stop_all_jobs_on_sched is > > removed, due > > + * latter being incorrect and racy. > > + * > > + * See > > https://lore.kernel.org/amd-gfx/44edde63-7181-44fb-a4f7-94e50514f539@xxxxxxx/ > > + */ > > +static struct drm_sched_job * > > +__drm_sched_entity_queue_pop(struct drm_sched_entity *entity) > > +{ > > + struct spsc_node *node; > > + > > + node = spsc_queue_pop(&entity->job_queue); > > + if (!node) > > + return NULL; > > + > > + return container_of(node, struct drm_sched_job, > > queue_node); > > +} > > > > void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler > > *sched) > > { > > @@ -425,7 +441,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct > > drm_gpu_scheduler *sched) > > struct drm_sched_rq *rq = sched->sched_rq[i]; > > spin_lock(&rq->lock); > > list_for_each_entry(s_entity, &rq->entities, list) > > { > > - while ((s_job = > > to_drm_sched_job(spsc_queue_pop(&s_entity->job_queue)))) { > > + while ((s_job = > > __drm_sched_entity_queue_pop(s_entity))) { > > struct drm_sched_fence *s_fence = > > s_job->s_fence; > > > > dma_fence_signal(&s_fence- > > >scheduled); >