On 19/12/2019 21:04, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-12-19 18:00:19)
> > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >
> > Expose per-client and per-engine busyness under the previously added
> > sysfs client root.
> >
> > There is one file per engine instance, located under the 'busy'
> > directory. Each contains a monotonically increasing, nanosecond
> > resolution count of the time the client's jobs have spent executing
> > on the GPU.
> >
> > This enables userspace to create a top-like tool for GPU utilization:
> > ==========================================================================
> > intel-gpu-top -  935/ 935 MHz;  0% RC6;  14.73 Watts;  1097 irqs/s
> >
> >       IMC reads:     1401 MiB/s
> >      IMC writes:        4 MiB/s
> >
> >          ENGINE     BUSY                                  MI_SEMA MI_WAIT
> >     Render/3D/0   63.73% |███████████████████           |      3%      0%
> >       Blitter/0    9.53% |██▊                           |      6%      0%
> >         Video/0   39.32% |███████████▊                  |     16%      0%
> >         Video/1   15.62% |████▋                         |      0%      0%
> >  VideoEnhance/0    0.00% |                              |      0%      0%
> >
> >   PID      NAME     RCS        BCS        VCS       VECS
> >  4084  gem_wsim |█████▌   ||█        ||         ||        |
> >  4086  gem_wsim |█▌       ||         ||███      ||        |
> > ==========================================================================
> > v2: Use intel_context_engine_get_busy_time.
> > v3: New directory structure.
> > v4: Rebase.
> > v5: sysfs_attr_init.
> > v6: Small tidy in i915_gem_add_client.
> > v7: Rebase to be engine class based.
> > v8:
> >  * Always enable stats.
> >  * Walk all client contexts.
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>
> Other than splitting it out into i915_drm_client.c (et al), it
> worksforme.
>
> However, it's about as useful as top, but without any means to
> kill/stop/reprioritise clients :(

Killing a client is a job for kill(2), no? Since the pid is already
exposed, it is just a matter of adding some code to intel_gpu_top.
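
Something along these lines, say, assuming the client directory carries
a 'pid' attribute next to the 'busy' directory (the attribute name and
path below are my guess, not necessarily the final ABI):

/*
 * Sketch only: kill the process behind one sysfs client node, e.g.
 * /sys/class/drm/card0/clients/7. No driver support needed, just
 * plain kill(2) on the advertised pid.
 */
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

static int kill_client(const char *client_dir)
{
	char path[256];
	FILE *file;
	int pid = -1;

	snprintf(path, sizeof(path), "%s/pid", client_dir);

	file = fopen(path, "r");
	if (!file)
		return -1;
	if (fscanf(file, "%d", &pid) != 1)
		pid = -1;
	fclose(file);

	if (pid <= 0)
		return -1;

	return kill(pid, SIGKILL);
}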

Unless we also want to cover killing work belonging to already exited
clients. That would probably be nice for feature completeness, and for
it we probably want an ioctl. But it would be the first one to allow
direct action on unrelated clients, even if gated by CAP_SYS_ADMIN for
instance.
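
If we went there, the uapi could look something along these lines -
completely hypothetical, including the struct, name and ioctl number:

/* Hypothetical sketch, nothing like this exists in i915_drm.h today. */
#include "drm.h"

struct drm_i915_client_kill {
	__u64 client_id;	/* client id as exposed in sysfs */
	__u32 flags;		/* MBZ */
	__u32 rsvd;
};

/* Acts on unrelated clients, so would be CAP_SYS_ADMIN only. */
#define DRM_I915_CLIENT_KILL		0x3c
#define DRM_IOCTL_I915_CLIENT_KILL \
	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_CLIENT_KILL, \
		 struct drm_i915_client_kill)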

> To give me actionable data, do we not need more of a perf interface
> where events are sent for client start/stop so that observers can
> record the context utilisation within their sample periods? I'm
> thinking of the "perf stat wsim..." use case where it gives me a
> breakdown of each workload.

It is doable, I think. I had a prototype back when I initially started
playing with this. In short, what is required is a separate PMU "node"
and a map from pid to client. With that I was able to query GPU time
for each pid as profiled by perf. I should have a sketch in a branch
somewhere.
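
From memory the core of it was not much more than a pid-indexed lookup
feeding the perf event count, roughly like this (all names made up for
illustration, locking omitted):

/*
 * Rough sketch of the pid -> client map the PMU prototype kept.
 * Every name here is invented for illustration purposes.
 */
#include <linux/hashtable.h>
#include <linux/types.h>

struct client_pmu_entry {
	struct hlist_node node;
	pid_t pid;
	struct i915_drm_client *client;
};

static DEFINE_HASHTABLE(client_hash, 6);

/* Called with the (omitted) lock held. */
static struct i915_drm_client *lookup_client(pid_t pid)
{
	struct client_pmu_entry *entry;

	hash_for_each_possible(client_hash, entry, node, pid)
		if (entry->pid == pid)
			return entry->client;

	return NULL;
}
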
But IIRC I wasn't sure it was a good replacement for this sysfs
interface when just thinking about intel_gpu_top. Details escape me now.
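
For the intel_gpu_top use case the sysfs files are at least trivial to
consume - sample the monotonic counters twice and divide the delta by
the sample period. Roughly, with paths assumed as per the commit
message:

/*
 * Sketch: per-client busyness over one sample period. Assumes files
 * like /sys/class/drm/card0/clients/<id>/busy/<class> holding
 * monotonic nanosecond totals, per the commit message above.
 */
#include <inttypes.h>
#include <stdio.h>

static uint64_t read_busy_ns(const char *path)
{
	uint64_t val = 0;
	FILE *file = fopen(path, "r");

	if (file) {
		if (fscanf(file, "%" SCNu64, &val) != 1)
			val = 0;
		fclose(file);
	}

	return val;
}

/* busy% = (t1 - t0) * 100 / sample_period_ns */
static double busy_pct(uint64_t t0, uint64_t t1, uint64_t period_ns)
{
	return period_ns ? 100.0 * (double)(t1 - t0) / (double)period_ns : 0.0;
}
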
Regards,
Tvrtko