Quoting Michael Sartain (2019-01-29 01:52:12)
> On Mon, Jan 21, 2019, at 4:20 PM, Chris Wilson wrote:
> > Rather than every backend and GPU driver reinventing the same wheel for
> > user level debugging of HW execution, the common dma-fence framework
> > should include the tracing infrastructure required for most client API
> > level flow visualisation.
> >
> > With these common dma-fence level tracepoints, the userspace tools can
> > establish a detailed view of the client <-> HW flow across different
> > kernels. There is a strong ask to have this available, so that the
> > userspace developer can effectively assess if they're doing a good job
> > about feeding the beast of a GPU hardware.
> ...
>
> I've got a first pass of this visualizing with gpuvis. Screenshots:
>
> ; with dma_event tracepoints patch
> https://imgur.com/a/MwvoAYY
>
> ; with old i915 tracepoints
> https://imgur.com/a/tG2iyHS
>
> Couple questions...
>
> With your new dma_event tracepoints patch, we're still getting these
> tracepoints:
>
> i915_request_in
> i915_request_out

These are debugging aids, not really tracepoints, and should be covered
by trace_printk already. Left in this patch as they are a slightly
different argument to remove (as in they are not directly replaced by
dma-fence tracing).

> intel_engine_notify

To be removed upstream very shortly.

> And the in/out tracepoints line up with dma_fence_executes
> (same ctx:seqno and time):
>
> <idle>-0 [006] 150.376273: dma_fence_execute_start: context=31, seqno=35670, hwid=0
> <idle>-0 [006] 150.413215: dma_fence_execute_end: context=31, seqno=35670, hwid=0
>
> <idle>-0 [006] 150.376272: i915_request_in: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35670, prio=0, global=41230, port=1
> <idle>-0 [006] 150.413217: i915_request_out: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35670, global=41230, completed?=1
>
> However I'm also seeing several i915_request_in --> intel_engine_notify
> tracepoints that don't have dma_fence_execute_* tracepoints:

Yes.
I was trying to wean the API off expecting an exact match and just
be happy with context in/out events, not request level details.

> RenderThread-1279 [001] 150.341336: dma_fence_init: driver=i915 timeline=ShooterGame[1226]/2 context=31 seqno=35669
> RenderThread-1279 [001] 150.341352: dma_fence_emit: context=31, seqno=35669
> <idle>-0 [006] 150.376271: i915_request_in: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35669, prio=0, global=41229, port=1
> <idle>-0 [006] 150.411525: intel_engine_notify: dev=0, engine=0:0, seqno=41229, waiters=1
> RenderThread-1279 [001] 150.419779: dma_fence_signaled: context=31, seqno=35669
> RenderThread-1279 [001] 150.419838: dma_fence_destroy: context=31, seqno=35669
>
> I assume something is going on at a lower level that we can't get the
> information for via dma_fence?

Deliberate obfuscation. It more or less lets us know what client was
running on the GPU at any one time, but you have to work back to
identify exactly what fence by inspecting the signaling timeline.
-Chris
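
For anyone post-processing these traces in userspace, here is a rough
Python sketch of grouping the dma_fence_* events by context:seqno and
then listing the fences that were signaled without a matching
dma_fence_execute_start, i.e. the ones you would have to work back to
from the signaling timeline. It assumes the plain-text ftrace line layout
shown in the quoted output above. The helper names (parse_events,
fence_timelines, signaled_without_execute) are made up for illustration
and are not part of gpuvis or the patch.

```python
import re
from collections import defaultdict

# Assumed line layout (taken from the quoted trace output):
#   <task>-<pid> [cpu] <timestamp>: <event>: key=value[,] key=value ...
LINE_RE = re.compile(r"\s(\d+\.\d+):\s+(\w+):\s+(.*)$")
FIELD_RE = re.compile(r"(\w+)=(\S+?)(?:,|\s|$)")
PREFIX = "dma_fence_"


def parse_events(lines):
    """Yield (timestamp, event_name, fields) for each line that parses."""
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        timestamp, name, payload = match.groups()
        yield float(timestamp), name, dict(FIELD_RE.findall(payload))


def fence_timelines(lines):
    """Map (context, seqno) -> {short event name: timestamp}."""
    fences = defaultdict(dict)
    for timestamp, name, fields in parse_events(lines):
        if not name.startswith(PREFIX):
            continue  # skip driver-specific events such as i915_request_in
        key = (int(fields["context"]), int(fields["seqno"]))
        fences[key][name[len(PREFIX):]] = timestamp
    return dict(fences)


def signaled_without_execute(fences):
    """Fences that signaled but never reported an execute_start event."""
    return sorted(
        key
        for key, events in fences.items()
        if "signaled" in events and "execute_start" not in events
    )
```

Feeding the two quoted excerpts through fence_timelines() would give
31:35670 an execute_start/execute_end pair, while 31:35669 only has
init/emit/signaled/destroy and so shows up in
signaled_without_execute().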