On 2017-10-20 09:32 AM, Andrey Grodzovsky wrote:
> Bug: amdgpu_job_free_cb was accessing s_job->s_entity when the allocated
> amdgpu_ctx (and the entity inside it) were already deallocated from
> amdgpu_cs_parser_fini.
>
> Fix: Save the job's priority at its creation instead of accessing it from
> s_entity later on.

I'm not sure this is the correct approach for a fix. Keeping s_entity as
a dangling pointer could allow similar bugs to be reintroduced. For
example, there would still be a race condition between
amdgpu_cs_parser_fini() and amdgpu_job_dependency().

Instead, it might be better for the job to grab a reference to the
context during job_init(), and to put that reference on job free.

Regards,
Andres

>
> Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  3 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |  5 ++---
>  drivers/gpu/drm/amd/scheduler/gpu_scheduler.c |  1 +
>  drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 32 ++++++++++++---------------
>  4 files changed, 18 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index f7fceb6..a760b6e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1192,8 +1192,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>  	job->uf_sequence = seq;
>
>  	amdgpu_job_free_resources(job);
> -	amdgpu_ring_priority_get(job->ring,
> -				 amd_sched_get_job_priority(&job->base));
> +	amdgpu_ring_priority_get(job->ring, job->base.s_priority);
>
>  	trace_amdgpu_cs_ioctl(job);
>  	amd_sched_entity_push_job(&job->base);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 0cfc68d..a58e3c5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -104,7 +104,7 @@ static void amdgpu_job_free_cb(struct amd_sched_job *s_job)
>  {
>  	struct amdgpu_job *job = container_of(s_job, struct amdgpu_job, base);
>
> -	amdgpu_ring_priority_put(job->ring, amd_sched_get_job_priority(s_job));
> +	amdgpu_ring_priority_put(job->ring, s_job->s_priority);
>  	dma_fence_put(job->fence);
>  	amdgpu_sync_free(&job->sync);
>  	amdgpu_sync_free(&job->dep_sync);
> @@ -141,8 +141,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring,
>  	job->fence_ctx = entity->fence_context;
>  	*f = dma_fence_get(&job->base.s_fence->finished);
>  	amdgpu_job_free_resources(job);
> -	amdgpu_ring_priority_get(job->ring,
> -				 amd_sched_get_job_priority(&job->base));
> +	amdgpu_ring_priority_get(job->ring, job->base.s_priority);
>  	amd_sched_entity_push_job(&job->base);
>
>  	return 0;
> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> index e4d3b4e..1bbbce2 100644
> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> @@ -529,6 +529,7 @@ int amd_sched_job_init(struct amd_sched_job *job,
>  {
>  	job->sched = sched;
>  	job->s_entity = entity;
> +	job->s_priority = entity->rq - sched->sched_rq;
>  	job->s_fence = amd_sched_fence_create(entity, owner);
>  	if (!job->s_fence)
>  		return -ENOMEM;
> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
> index 52c8e54..3f75b45 100644
> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
> @@ -30,6 +30,19 @@
>  struct amd_gpu_scheduler;
>  struct amd_sched_rq;
>
> +enum amd_sched_priority {
> +	AMD_SCHED_PRIORITY_MIN,
> +	AMD_SCHED_PRIORITY_LOW = AMD_SCHED_PRIORITY_MIN,
> +	AMD_SCHED_PRIORITY_NORMAL,
> +	AMD_SCHED_PRIORITY_HIGH_SW,
> +	AMD_SCHED_PRIORITY_HIGH_HW,
> +	AMD_SCHED_PRIORITY_KERNEL,
> +	AMD_SCHED_PRIORITY_MAX,
> +	AMD_SCHED_PRIORITY_INVALID = -1,
> +	AMD_SCHED_PRIORITY_UNSET = -2
> +};
> +
> +
>  /**
>   * A scheduler entity is a wrapper around a job queue or a group
>   * of other entities. Entities take turns emitting jobs from their
> @@ -83,6 +96,7 @@ struct amd_sched_job {
>  	struct delayed_work	work_tdr;
>  	uint64_t		id;
>  	atomic_t		karma;
> +	enum amd_sched_priority	s_priority;
>  };
>
>  extern const struct dma_fence_ops amd_sched_fence_ops_scheduled;
> @@ -114,18 +128,6 @@ struct amd_sched_backend_ops {
>  	void (*free_job)(struct amd_sched_job *sched_job);
>  };
>
> -enum amd_sched_priority {
> -	AMD_SCHED_PRIORITY_MIN,
> -	AMD_SCHED_PRIORITY_LOW = AMD_SCHED_PRIORITY_MIN,
> -	AMD_SCHED_PRIORITY_NORMAL,
> -	AMD_SCHED_PRIORITY_HIGH_SW,
> -	AMD_SCHED_PRIORITY_HIGH_HW,
> -	AMD_SCHED_PRIORITY_KERNEL,
> -	AMD_SCHED_PRIORITY_MAX,
> -	AMD_SCHED_PRIORITY_INVALID = -1,
> -	AMD_SCHED_PRIORITY_UNSET = -2
> -};
> -
>  /**
>   * One scheduler is implemented for each hardware ring
>   */
> @@ -176,10 +178,4 @@ bool amd_sched_dependency_optimized(struct dma_fence* fence,
>  					struct amd_sched_entity *entity);
>  void amd_sched_job_kickout(struct amd_sched_job *s_job);
>
> -static inline enum amd_sched_priority
> -amd_sched_get_job_priority(struct amd_sched_job *job)
> -{
> -	return (job->s_entity->rq - job->sched->sched_rq);
> -}
> -
>  #endif
>
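To make the suggested alternative concrete, here is a minimal userspace sketch of the lifetime pattern I have in mind (plain C with a stand-in refcount instead of the kernel's kref; the ctx/job names are illustrative, not the actual amdgpu API): the job takes a reference on the context at init time and drops it in the free callback, so the context — and the entity embedded in it — stays valid for every in-flight job even if the CS parser drops its own reference first.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-ins for amdgpu_ctx / amd_sched_job (not the real structs). */
struct ctx {
	int refcount;   /* the kernel would use a struct kref here */
	int priority;
};

struct job {
	struct ctx *ctx;   /* reference held for the job's whole lifetime */
};

static struct ctx *ctx_get(struct ctx *c)
{
	c->refcount++;
	return c;
}

static void ctx_put(struct ctx *c)
{
	if (--c->refcount == 0)
		free(c);   /* ctx (and its entity) freed only after the last user */
}

/* job_init: grab a ctx reference, per the suggestion above. */
static void job_init(struct job *j, struct ctx *c)
{
	j->ctx = ctx_get(c);
}

/* job free callback: the ctx is still valid here, no dangling pointer. */
static void job_free_cb(struct job *j)
{
	int prio = j->ctx->priority;   /* safe: we still hold a reference */
	(void)prio;
	ctx_put(j->ctx);
}
```

With this scheme the parser teardown can drop its reference at any time; the context outlives every pending job, so the dependency callback cannot race with the teardown either.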