On 22/01/2018 18:52, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-01-22 18:43:52)
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Per-engine queue depths are an interesting metric for analyzing the system load
and also for users who wish to load balance their submissions based on them.
In this version I have split the metrics into three separate counters:
1. QUEUED - From execbuf time until the request becomes runnable, that is,
until its dependencies have been resolved and fences signaled.
2. RUNNABLE - From runnable to running on the GPU.
3. RUNNING - Running on the GPU.
When inspected with perf stat the output looks roughly like this:
#           time    counts unit events
   201.160490145      0.01      i915/rcs0-queued/
   201.160490145     19.13      i915/rcs0-runnable/
   201.160490145      2.39      i915/rcs0-running/
The reported numbers are average queue depths for the last query period.
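To illustrate what "average queue depth over a period" means, here is a minimal sketch of a time-weighted average. It is not the driver's implementation: the depth_sample structure and average_queue_depth() are hypothetical names, and the sketch assumes each sample records a depth together with how long that depth held.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample: a queue depth that held for duration_ns. */
struct depth_sample {
	uint32_t depth;
	uint64_t duration_ns;
};

/* Time-weighted average depth across all samples in one period. */
static double average_queue_depth(const struct depth_sample *s, size_t n)
{
	uint64_t weighted = 0, total_ns = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		weighted += (uint64_t)s[i].depth * s[i].duration_ns;
		total_ns += s[i].duration_ns;
	}

	/* An empty period has no meaningful depth; report zero. */
	return total_ns ? (double)weighted / (double)total_ns : 0.0;
}
```

A depth of 2 held for 100ns followed by a depth of 4 held for 300ns averages to 3.5, which is why fractional counts such as 19.13 can show up in the perf stat output.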
Having the metrics split out should be more flexible for all users, and it is
still possible to fetch an atomic snapshot of all three using perf groups for
those wanting to combine them.
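As a rough sketch of the perf group approach, the helper below decodes the buffer returned by read() on a group leader opened with read_format set to PERF_FORMAT_GROUP only, where the first u64 is the event count followed by one value per event. The queue_snapshot structure and parse_group_read() are hypothetical names, and the sketch assumes the three events were opened in the order queued (leader), runnable, running.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one atomic reading of the three counters. */
struct queue_snapshot {
	uint64_t queued;
	uint64_t runnable;
	uint64_t running;
};

/*
 * Decode a PERF_FORMAT_GROUP read buffer: buf[0] holds the number of
 * events, buf[1..] hold their values in the order they were opened.
 * Returns 0 on success, -1 if the buffer is too short or the group
 * does not contain exactly three events.
 */
static int parse_group_read(const uint64_t *buf, size_t len_bytes,
			    struct queue_snapshot *out)
{
	if (len_bytes < 4 * sizeof(uint64_t) || buf[0] != 3)
		return -1;

	out->queued = buf[1];
	out->runnable = buf[2];
	out->running = buf[3];

	return 0;
}
```

Because the kernel fills the whole buffer in a single read, the three values belong to the same instant, which is what makes the snapshot atomic.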
For users wanting instantaneous numbers instead of averaged, we could potentially
expose them using the query API Lionel is working on.
(https://patchwork.freedesktop.org/series/36622/)
For instance a query packet could look like:
#define DRM_I915_QUERY_ENGINE_QUEUES	0x04

struct drm_i915_query_engine_queues {
	__u8 class;
	__u8 instance;
	__u8 pad[2];

	__u32 queued;
	__u32 runnable;
	__u32 running;
};
I also have patches to expose this via intel-gpu-top, using the perf API.
Can you stick a ewma loadavg just after the hostname in intel-gpu-overlay,
pretty please? :)
Sure, just one period and all three counters aggregated?
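For illustration, a minimal sketch of what that could look like: one exponentially weighted moving average fed with the sum of the three per-period counters. The ewma_load structure, ewma_load_update() and the alpha weight are all hypothetical; the actual smoothing period would be whatever the overlay ends up using.

```c
/* Hypothetical EWMA state for a single aggregated load figure. */
struct ewma_load {
	double value;	/* current smoothed load */
	double alpha;	/* weight of the newest sample, in (0, 1] */
	int primed;	/* non-zero once the first sample has been seen */
};

/*
 * Fold one period's average queue depths into the moving average.
 * The three counters are summed into a single sample, as in
 * "all three counters aggregated".
 */
static double ewma_load_update(struct ewma_load *l, double queued,
			       double runnable, double running)
{
	double sample = queued + runnable + running;

	if (!l->primed) {
		/* Seed with the first sample to avoid ramping up from zero. */
		l->value = sample;
		l->primed = 1;
	} else {
		l->value += l->alpha * (sample - l->value);
	}

	return l->value;
}
```

With alpha set to 0.5, a first sample of 4 followed by an idle period gives 4 and then 2, so the figure decays gradually instead of dropping straight to zero.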
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx