On 19/12/2019 21:04, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-12-19 18:00:19)
> > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >
> > Expose per-client and per-engine busyness under the previously added
> > sysfs client root.
> >
> > There is one file per engine instance, located under the 'busy'
> > directory. Each contains a monotonically increasing, nanosecond
> > resolution count of the time the client's jobs have spent executing
> > on the GPU.
> >
> > This enables userspace to create a top-like tool for GPU utilization:
> > ==========================================================================
> > intel-gpu-top -  935/ 935 MHz;  0% RC6;  14.73 Watts;  1097 irqs/s
> >
> >       IMC reads:     1401 MiB/s
> >      IMC writes:        4 MiB/s
> >
> >          ENGINE     BUSY                                  MI_SEMA MI_WAIT
> >     Render/3D/0   63.73% |███████████████████           |      3%      0%
> >       Blitter/0    9.53% |██▊                           |      6%      0%
> >         Video/0   39.32% |███████████▊                  |     16%      0%
> >         Video/1   15.62% |████▋                         |      0%      0%
> >  VideoEnhance/0    0.00% |                              |      0%      0%
> >
> >   PID      NAME     RCS        BCS        VCS       VECS
> >  4084  gem_wsim |█████▌   ||█        ||         ||        |
> >  4086  gem_wsim |█▌       ||         ||███      ||        |
> > ==========================================================================
> > v2: Use intel_context_engine_get_busy_time.
> > v3: New directory structure.
> > v4: Rebase.
> > v5: sysfs_attr_init.
> > v6: Small tidy in i915_gem_add_client.
> > v7: Rebase to be engine class based.
> > v8:
> >  * Always enable stats.
> >  * Walk all client contexts.
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>
> Other than splitting it out into i915_drm_client.c (et al), it
> worksforme.
>
> However, it's about as useful as top, but without any means to
> kill/stop/reprioritise clients :(

Killing a client is a job for kill(2), no? Since the pid is already
exposed, it is just a matter of adding some code to intel_gpu_top.
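
Something along these lines, say, assuming the client directory carries
a 'pid' attribute next to the 'busy' directory (the attribute name and
path below are my guess, not necessarily the final ABI):

/*
 * Sketch only: kill the process behind one sysfs client node, e.g.
 * /sys/class/drm/card0/clients/7. No driver support needed, just
 * plain kill(2) on the advertised pid.
 */
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

static int kill_client(const char *client_dir)
{
	char path[256];
	FILE *file;
	int pid = -1;

	snprintf(path, sizeof(path), "%s/pid", client_dir);

	file = fopen(path, "r");
	if (!file)
		return -1;
	if (fscanf(file, "%d", &pid) != 1)
		pid = -1;
	fclose(file);

	if (pid <= 0)
		return -1;

	return kill(pid, SIGKILL);
}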

Unless we also want to cover killing work belonging to already exited
clients. That would probably be nice for feature completeness, and for
it we probably want an ioctl. But it would be the first one to allow
direct action on unrelated clients, even if gated by CAP_SYS_ADMIN for
instance.
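
If we went there, the uapi could look something along these lines -
completely hypothetical, including the struct, name and ioctl number:

/* Hypothetical sketch, nothing like this exists in i915_drm.h today. */
#include "drm.h"

struct drm_i915_client_kill {
	__u64 client_id;	/* client id as exposed in sysfs */
	__u32 flags;		/* MBZ */
	__u32 rsvd;
};

/* Acts on unrelated clients, so would be CAP_SYS_ADMIN only. */
#define DRM_I915_CLIENT_KILL		0x3c
#define DRM_IOCTL_I915_CLIENT_KILL \
	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_CLIENT_KILL, \
		 struct drm_i915_client_kill)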

> To give me actionable data, do we not need more of a perf interface
> where events are sent for client start/stop so that observers can
> record the context utilisation within their sample periods? I'm
> thinking of the "perf stat wsim..." use case where it gives me a
> breakdown of each workload.

It is doable, I think. I had a prototype back when I initially started
playing with this. In short, what is required is a separate PMU "node"
and a map from pid to client. With that I was able to query GPU time
for each pid as profiled by perf. I should have a sketch in a branch
somewhere.
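
From memory the core of it was not much more than a pid-indexed lookup
feeding the perf event count, roughly like this (all names made up for
illustration, locking omitted):

/*
 * Rough sketch of the pid -> client map the PMU prototype kept.
 * Every name here is invented for illustration purposes.
 */
#include <linux/hashtable.h>
#include <linux/types.h>

struct client_pmu_entry {
	struct hlist_node node;
	pid_t pid;
	struct i915_drm_client *client;
};

static DEFINE_HASHTABLE(client_hash, 6);

/* Called with the (omitted) lock held. */
static struct i915_drm_client *lookup_client(pid_t pid)
{
	struct client_pmu_entry *entry;

	hash_for_each_possible(client_hash, entry, node, pid)
		if (entry->pid == pid)
			return entry->client;

	return NULL;
}
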
But IIRC I wasn't sure it was a good replacement for this sysfs
interface when just thinking about intel_gpu_top. Details escape me now.
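
For the intel_gpu_top use case the sysfs files are at least trivial to
consume - sample the monotonic counters twice and divide the delta by
the sample period. Roughly, with paths assumed as per the commit
message:

/*
 * Sketch: per-client busyness over one sample period. Assumes files
 * like /sys/class/drm/card0/clients/<id>/busy/<class> holding
 * monotonic nanosecond totals, per the commit message above.
 */
#include <inttypes.h>
#include <stdio.h>

static uint64_t read_busy_ns(const char *path)
{
	uint64_t val = 0;
	FILE *file = fopen(path, "r");

	if (file) {
		if (fscanf(file, "%" SCNu64, &val) != 1)
			val = 0;
		fclose(file);
	}

	return val;
}

/* busy% = (t1 - t0) * 100 / sample_period_ns */
static double busy_pct(uint64_t t0, uint64_t t1, uint64_t period_ns)
{
	return period_ns ? 100.0 * (double)(t1 - t0) / (double)period_ns : 0.0;
}
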
Regards,
Tvrtko