On 2017-02-23 02:46 PM, Andres Rodriguez wrote:
>
>
> On 2017-02-23 03:20 AM, Christian König wrote:
>> Am 23.02.2017 um 00:47 schrieb Andres Rodriguez:
>>> This trace is intended to provide the required information to associate
>>> the completion of an amdgpu_job with its corresponding dma_fence_*
>>> tracepoints.
>>>
>>> Signed-off-by: Andres Rodriguez <andresx7 at gmail.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |  2 ++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 22 ++++++++++++++++++++++
>>>   2 files changed, 24 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index 86a1242..84a04e4 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -177,6 +177,8 @@ static struct dma_fence *amdgpu_job_run(struct
>>> amd_sched_job *sched_job)
>>>       /* if gpu reset, hw fence will be replaced here */
>>>       dma_fence_put(job->fence);
>>>       job->fence = dma_fence_get(fence);
>>> +    trace_amdgpu_job_attach_fence(job);
>>> +
>>>       amdgpu_job_free_resources(job);
>>>       return fence;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
>>> index a18ae1e..0a61ed9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
>>> @@ -147,6 +147,28 @@ TRACE_EVENT(amdgpu_sched_run_job,
>>>             __entry->fence, __entry->ring_name, __entry->num_ibs)
>>>   );
>>> +TRACE_EVENT(amdgpu_job_attach_fence,
>>> +        TP_PROTO(struct amdgpu_job *job),
>>> +        TP_ARGS(job),
>>> +        TP_STRUCT__entry(
>>> +                 __field(struct amdgpu_device *, adev)
>>> +                 __field(struct amd_sched_job *, sched_job)
>>
>> Neither add adev nor sched_job pointer to your trace point.
>
> Would it be okay then to grab a reference to the scheduler fence in
> amdgpu_job_run and print the sched fence pointer here?
>
> That would give us enough info to correlate other trace points like
> amdgpu_cs_ioctl or amdgpu_sched_run_job to the respective dma_fence
> tracepoints.
>
>
>>
>> The adev pointer is completely meaningless in the logs and the
>> scheduler job might already be freed when the printk is called.

Hey Christian,

I was just double checking the lifetime of the scheduler job, and it
should still be valid during amdgpu_job_run. The lifetime of the job is
tied to the scheduler's finished fence, which won't be signaled until
amdgpu_job_run finishes, even if the HW fence has already signaled. This
assumes the trace printks are synchronous.

So I think printing the job and dropping adev should be okay if the
above holds true, unless I'm missing something else. I've put a rough
sketch of a revised tracepoint at the bottom of this mail.

>>
>> Just checked the source and a couple of other trace points get this
>> horribly wrong as well, so that isn't your fault. Going to put it on
>> my todo to fix those.
>>
>> Regards,
>> Christian.
>>
>>> +        __string(timeline,
>>> job->fence->ops->get_timeline_name(job->fence))
>>> +        __field(unsigned int, context)
>>> +        __field(unsigned int, seqno)
>>> +        ),
>>> +
>>> +        TP_fast_assign(
>>> +               __entry->adev = job->adev;
>>> +               __entry->sched_job = &job->base;
>>> +               __assign_str(timeline,
>>> job->fence->ops->get_timeline_name(job->fence))
>>> +               __entry->context = job->fence->context;
>>> +               __entry->seqno = job->fence->seqno;
>>> +               ),
>>> +        TP_printk("adev=%p, sched_job=%p, timeline:%s, context:%u,
>>> seqno:%u",
>>> +              __entry->adev, __entry->sched_job,
>>> +              __get_str(timeline), __entry->context, __entry->seqno)
>>> +);
>>>   TRACE_EVENT(amdgpu_vm_grab_id,
>>>       TP_PROTO(struct amdgpu_vm *vm, int ring, struct amdgpu_job
>>> *job),
>>
>>
>
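For reference, here is a rough, untested sketch of the revised
tracepoint I have in mind: drop adev and the raw sched_job pointer, and
log the scheduler fence's context/seqno alongside the HW fence's. Those
numbers stay meaningful after the job is freed and line up with what the
dma_fence_* tracepoints print. I'm assuming the finished fence is
reachable as job->base.s_fence->finished, and the sched_context /
sched_seqno field names are just placeholders; field types mirror the
original patch.

TRACE_EVENT(amdgpu_job_attach_fence,
	    TP_PROTO(struct amdgpu_job *job),
	    TP_ARGS(job),
	    TP_STRUCT__entry(
			     /* HW fence identity, matches the dma_fence_*
			      * tracepoints */
			     __string(timeline, job->fence->ops->get_timeline_name(job->fence))
			     __field(unsigned int, context)
			     __field(unsigned int, seqno)
			     /* scheduler fence identity, for correlating
			      * with amdgpu_cs_ioctl/amdgpu_sched_run_job */
			     __field(unsigned int, sched_context)
			     __field(unsigned int, sched_seqno)
			     ),

	    TP_fast_assign(
			   __assign_str(timeline, job->fence->ops->get_timeline_name(job->fence));
			   __entry->context = job->fence->context;
			   __entry->seqno = job->fence->seqno;
			   __entry->sched_context = job->base.s_fence->finished.context;
			   __entry->sched_seqno = job->base.s_fence->finished.seqno;
			   ),
	    TP_printk("sched_fence=%u:%u, timeline:%s, context:%u, seqno:%u",
		      __entry->sched_context, __entry->sched_seqno,
		      __get_str(timeline), __entry->context, __entry->seqno)
);

With context/seqno pairs on both sides there is no need to keep a
reference to the scheduler fence alive just for the trace, and nothing
pointer-valued ends up in the log.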