This patch series enables userspace utilities like gputop and nvtop to
query a render context's fdinfo file and figure out rates of engine
and memory utilisation.

Previous discussion can be found at
https://lore.kernel.org/dri-devel/20240920234436.207563-1-adrian.larumbe@xxxxxxxxxxxxx/

Changelog:
v8:
 - Fixed uninitialised stack variable bug that was triggering an
   invalid memory reference.
 - Added a few R-b tags to commits.
v7:
 - Fixed some documentation and sign mismatch errors reported by the
   kernel test bot.
 - Defined convenience macros for specifying CS instructions according
   to their profiled status.
 - Explicitly initialised instruction count for structure containing a
   job's instructions when calculating its amount of credits for the
   scheduler.
 - Some minor cosmetic nits.
v6:
 - Addressed some nits and style issues.
 - Enforced static assert equality of instruction buffer when
   calculating job credits or copying them into the ringbuffer.
 - Added an explanation of how job credits and profiled job size are
   calculated.
 - Broke down fdinfo enablement patch into two, one of them dealing
   with adding support for calculating the current and top operating
   device frequencies.
 - Fixed race at the time drm file-wide profiling stats are gathered
   from groups.
v5:
 - Moved profiling information slots into a per-queue BO and away from
   syncobjs.
 - Decide on size of profiling slots BO from size of CS for minimal
   profiled job.
 - Turn job and device profiling flag into a bit mask so that
   individual metrics can be enabled separately.
 - Shrunk ringbuffer slot size to that of a cache line.
 - Track profiling slot indices separately from the job's queue
   ringbuffer's.
 - Emit CS instructions one by one and tag them depending on profiling
   mask.
 - New helper for calculating job credits depending on profiling flags.
 - Add Documentation file for sysfs profiling knob.
 - fdinfo will only show engines or cycles tags if these are
   respectively enabled.
v4:
 - Fixed wrong assignment location for frequency values in Panthor's
   devfreq.
 - Removed the last two commits about registering size of internal
   BOs.
 - Rearranged patch series so that sysfs knob is done last and all the
   previous time sampling and fdinfo show dependencies are already in
   place.
v3:
 - Fixed some nits and removed useless bounds check in panthor_sched.c
 - Added support for sysfs profiling knob and optional job accounting.
 - Added new patches for calculating size of internal BOs.
v2:
 - Split original first patch in two, one for FW CS cycle and
   timestamp calculations and job accounting memory management, and a
   second one that enables fdinfo.
 - Moved NUM_INSTRS_PER_SLOT to the file prelude.
 - Removed nelem variable from the group's struct definition.
 - Precompute size of group's syncobj BO to avoid code duplication.
 - Some minor nits.

Adrián Larumbe (5):
  drm/panthor: introduce job cycle and timestamp accounting
  drm/panthor: record current and maximum device clock frequencies
  drm/panthor: add DRM fdinfo support
  drm/panthor: enable fdinfo for memory stats
  drm/panthor: add sysfs knob for enabling job profiling

 .../testing/sysfs-driver-panthor-profiling    |  10 +
 Documentation/gpu/panthor.rst                 |  46 +++
 drivers/gpu/drm/panthor/panthor_devfreq.c     |  18 +-
 drivers/gpu/drm/panthor/panthor_device.h      |  36 ++
 drivers/gpu/drm/panthor/panthor_drv.c         |  73 ++++
 drivers/gpu/drm/panthor/panthor_gem.c         |  12 +
 drivers/gpu/drm/panthor/panthor_sched.c       | 384 +++++++++++++++---
 drivers/gpu/drm/panthor/panthor_sched.h       |   2 +
 8 files changed, 531 insertions(+), 50 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-panthor-profiling
 create mode 100644 Documentation/gpu/panthor.rst

-- 
2.46.0
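
For reference, the kind of userspace consumer this series targets can be sketched as follows. This is a minimal illustration, not code from gputop or nvtop: it assumes the generic DRM fdinfo convention of "drm-engine-<name>: <value> ns" lines, and the engine name "panthor" in the usage note is an assumption rather than something stated in this cover letter. The function names are hypothetical.

```python
import re

# Assumed line format, following the generic DRM fdinfo convention:
#   drm-engine-<name>:\t<busy time> ns
# Lines without the "ns" unit (e.g. capacity keys) are deliberately skipped.
_ENGINE_RE = re.compile(r"^drm-engine-(?P<name>[^:\s]+):\s*(?P<ns>\d+)\s*ns$")


def parse_engine_ns(fdinfo_text):
    """Return {engine name: accumulated busy time in ns} from fdinfo text."""
    engines = {}
    for line in fdinfo_text.splitlines():
        match = _ENGINE_RE.match(line.strip())
        if match:
            engines[match.group("name")] = int(match.group("ns"))
    return engines


def engine_utilisation(before, after, elapsed_ns):
    """Busy fraction per engine between two samples taken elapsed_ns apart."""
    if elapsed_ns <= 0:
        raise ValueError("elapsed_ns must be positive")
    return {
        name: max(0, after[name] - before[name]) / elapsed_ns
        for name in after
        if name in before
    }
```

A monitoring tool would read /proc/<pid>/fdinfo/<fd> twice, pass each snapshot through parse_engine_ns(), and divide the busy-time delta by the wall-clock interval (for example from time.monotonic_ns()). With the sysfs profiling knob disabled, the engine key may be absent, in which case the returned dictionary is simply empty.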