v5 of https://lore.kernel.org/all/20240515214258.59209-1-lucas.demarchi@xxxxxxxxx

Add per-client usage statistics to xe. This ports xe to use the common
method in drm to export the usage to userspace per client (where
1 client == 1 drm fd open).

However, instead of using the current format measured in nsec, this
creates a new one. The intention here is not to mix the GPU clock domain
with the CPU clock. This allows covering a few more use cases without
extra complications.

This version is tested on an ADL-P. gputop was also checked with i915 to
make sure it has not regressed. The last patch also contains the
documentation for the new key and sample output, as requested in v1. The
pre-existing drm-cycles-<keystr> is used as is, which allows gputop to
work with xe.

This last patch still has some open discussion from v2, so we may need
to hold it a little longer.

v2:
  - Create a new drm-total-cycles instead of re-using drm-engine with a
    different unit
  - Add documentation for the new interface and clarify usage of
    xe_lrc_update_timestamp()

v3:
  - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" -
    see commit message
  - Reorder commits so the ones that are useful in other patch series
    come first

v4:
  - Fix some comments and documentation
  - Add 2 patches so we cache on the gt the mask of engines visible to
    userspace and the per-class capacity. Previously we were doing this
    during the query, but besides not being very efficient, as pointed
    out by Tvrtko, we were also not correctly handling reserved engines
    and engines "hidden" by e.g. ccs_mode. So move that part to be
    executed on init and when changing the available engines.
  - Simplify the fdinfo output loop since now we have the information
    cached on gt. This also moves the read of the gpu timestamp out of
    the loop, as suggested by Tvrtko, using the helpers implemented in
    the new patches.

v5:
  - Fix kernel-doc
  - Move pm_runtime_put() earlier in the function as it's not needed
    anymore after interacting with the HW.
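
For readers who want to see how a userspace tool could consume the
cycle-based keys, here is a minimal sketch in C. It is not taken from the
series or from gputop. It assumes the fdinfo text contains one
"drm-cycles-<keystr>: <uint>" line and one
"drm-total-cycles-<keystr>: <uint>" line per engine class, and that
utilization is the ratio of the two deltas between two samples. The
function names fdinfo_read_u64() and engine_busy_percent() are
hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "<key>:" at the start of a line in an fdinfo text snapshot and
 * parse the unsigned integer that follows. Returns 0 on success, -1 if
 * the key is missing or is not followed by a number.
 */
static int fdinfo_read_u64(const char *text, const char *key, uint64_t *out)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		if (!strncmp(line, key, klen) && line[klen] == ':') {
			const char *p = line + klen + 1;

			while (*p == ' ' || *p == '\t')
				p++;
			if (*p < '0' || *p > '9')
				return -1;
			*out = strtoull(p, NULL, 10);
			return 0;
		}
		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Assumed interpretation: busy% of one engine class for one client is
 * delta(drm-cycles) / delta(drm-total-cycles) * 100 between two samples.
 * Both counters are in the GPU clock domain, so no CPU time is involved.
 * Returns a negative value if a key is missing or no GPU time elapsed.
 */
static double engine_busy_percent(const char *before, const char *after,
				  const char *engine)
{
	char cycles_key[64], total_key[64];
	uint64_t c0, c1, t0, t1;

	snprintf(cycles_key, sizeof(cycles_key), "drm-cycles-%s", engine);
	snprintf(total_key, sizeof(total_key), "drm-total-cycles-%s", engine);

	if (fdinfo_read_u64(before, cycles_key, &c0) ||
	    fdinfo_read_u64(after, cycles_key, &c1) ||
	    fdinfo_read_u64(before, total_key, &t0) ||
	    fdinfo_read_u64(after, total_key, &t1))
		return -1.0;

	if (t1 <= t0 || c1 < c0)
		return -1.0;

	return 100.0 * (double)(c1 - c0) / (double)(t1 - t0);
}
```

A caller would read /proc/<pid>/fdinfo/<fd> twice, some interval apart,
and pass both buffers along with an engine class name such as "rcs".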

Lucas De Marchi (6):
  drm/xe: Promote xe_hw_engine_class_to_str()
  drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
  drm/xe: Add helper to capture engine timestamp
  drm/xe: Cache data about user-visible engines
  drm/xe: Add helper to return any available hw engine
  drm/xe/client: Print runtime to fdinfo

Umesh Nerlige Ramappa (2):
  drm/xe/lrc: Add helper to capture context timestamp
  drm/xe: Add helper to accumulate exec queue runtime

 Documentation/gpu/drm-usage-stats.rst         |  21 ++-
 Documentation/gpu/xe/index.rst                |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst   |  10 ++
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h       |   1 +
 drivers/gpu/drm/xe/xe_device_types.h          |   3 +
 drivers/gpu/drm/xe/xe_drm_client.c            | 121 +++++++++++++++++-
 drivers/gpu/drm/xe/xe_exec_queue.c            |  37 ++++++
 drivers/gpu/drm/xe/xe_exec_queue.h            |   1 +
 drivers/gpu/drm/xe/xe_execlist.c              |   1 +
 drivers/gpu/drm/xe/xe_gt.c                    |  34 +++++
 drivers/gpu/drm/xe/xe_gt.h                    |  20 +++
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c           |   1 +
 drivers/gpu/drm/xe/xe_gt_types.h              |  21 ++-
 drivers/gpu/drm/xe/xe_guc_submit.c            |   2 +
 drivers/gpu/drm/xe/xe_hw_engine.c             |  27 ++++
 drivers/gpu/drm/xe/xe_hw_engine.h             |   3 +
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  18 ---
 drivers/gpu/drm/xe/xe_lrc.c                   |  12 ++
 drivers/gpu/drm/xe/xe_lrc.h                   |  14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h             |   3 +
 20 files changed, 329 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

-- 
2.43.0