On Tue, 24 Sep 2024 00:06:20 +0100
Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx> wrote:

> This patch series enables userspace utilities like gputop and nvtop to query a
> render context's fdinfo file and figure out rates of engine and memory utilisation.
>
> Previous discussion can be found at
> https://lore.kernel.org/dri-devel/20240920234436.207563-1-adrian.larumbe@xxxxxxxxxxxxx/
>
> Changelog:
> v8:
> - Fixed uninitialised stack variable bug that was triggering an invalid memory reference.
> - Added a few R-b tags to commits
> v7:
> - Fixed some kernel test bot-reported documentation and sign mismatch errors.
> - Defined convenience macros for specifying CS instructions according to their profiled status.
> - Explicitly initialised instruction count for structure containing a job's
>   instructions when calculating its amount of credits for the scheduler.
> - Some minor cosmetic nits.
> v6:
> - Addressed some nits and style issues.
> - Enforced static assert equality of instruction buffer when calculating job
>   credits or copying them into the ringbuffer.
> - Added explanation to the way in which job credits and profiled job size is done.
> - Broke down fdinfo enablement patch into two, one of them dealing with adding
>   support for calculating the current and top operating device frequencies
> - Fixed race at the time drm file-wide profiling stats are gathered from groups.
> v5:
> - Moved profiling information slots into a per-queue BO and away from syncobjs.
> - Decide on size of profiling slots BO from size of CS for minimal profiled job
> - Turn job and device profiling flag into a bit mask so that individual metrics
>   can be enabled separately.
> - Shrunk ringbuffer slot size to that of a cache line.
> - Track profiling slot indices separately from the job's queue ringbuffer's
> - Emit CS instructions one by one and tag them depending on profiling mask
> - New helper for calculating job credits depending on profiling flags
> - Add Documentation file for sysfs profiling knob
> - fdinfo will only show engines or cycles tags if these are respectively enabled.
> v4:
> - Fixed wrong assignment location for frequency values in Panthor's devfreq
> - Removed the last two commits about registering size of internal BO's
> - Rearranged patch series so that sysfs knob is done last and all the previous
>   time sampling and fdinfo show dependencies are already in place
> v3:
> - Fixed some nits and removed useless bounds check in panthor_sched.c
> - Added support for sysfs profiling knob and optional job accounting
> - Added new patches for calculating size of internal BO's
> v2:
> - Split original first patch in two, one for FW CS cycle and timestamp
>   calculations and job accounting memory management, and a second one
>   that enables fdinfo.
> - Moved NUM_INSTRS_PER_SLOT to the file prelude
> - Removed nelem variable from the group's struct definition.
> - Precompute size of group's syncobj BO to avoid code duplication.
> - Some minor nits.
>
> Adrián Larumbe (5):
>   drm/panthor: introduce job cycle and timestamp accounting
>   drm/panthor: record current and maximum device clock frequencies
>   drm/panthor: add DRM fdinfo support
>   drm/panthor: enable fdinfo for memory stats
>   drm/panthor: add sysfs knob for enabling job profiling

Queued to drm-misc-next after applying the few modifications I
mentioned. Also added Steve's ack (given on IRC) on the first patch.

>
>  .../testing/sysfs-driver-panthor-profiling  |  10 +
>  Documentation/gpu/panthor.rst               |  46 +++
>  drivers/gpu/drm/panthor/panthor_devfreq.c   |  18 +-
>  drivers/gpu/drm/panthor/panthor_device.h    |  36 ++
>  drivers/gpu/drm/panthor/panthor_drv.c       |  73 ++++
>  drivers/gpu/drm/panthor/panthor_gem.c       |  12 +
>  drivers/gpu/drm/panthor/panthor_sched.c     | 384 +++++++++++++++---
>  drivers/gpu/drm/panthor/panthor_sched.h     |   2 +
>  8 files changed, 531 insertions(+), 50 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-driver-panthor-profiling
>  create mode 100644 Documentation/gpu/panthor.rst
>
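
For anyone wanting to check the result from userspace, here is a rough
sketch of how a tool in the spirit of gputop might turn two fdinfo
samples into an engine utilisation rate. It assumes the generic DRM
fdinfo key format "drm-engine-<name>: <value> ns". The engine name
"panthor", the sample strings and both helper names are illustrative
assumptions, not output captured from hardware or code from the series.

```python
def parse_engine_ns(fdinfo_text):
    """Return {engine_name: busy_ns} for every 'drm-engine-<name>: <n> ns' line."""
    prefix = "drm-engine-"
    engines = {}
    for line in fdinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.startswith(prefix):
            continue
        fields = value.split()
        # Only accept "<uint> ns"; this skips unitless keys such as
        # drm-engine-capacity-<name>.
        if len(fields) == 2 and fields[1] == "ns" and fields[0].isdigit():
            engines[key[len(prefix):]] = int(fields[0])
    return engines


def utilisation(prev, curr, elapsed_ns):
    """Fraction of wall-clock time each engine was busy between two samples."""
    if elapsed_ns <= 0:
        raise ValueError("elapsed_ns must be positive")
    return {
        name: (curr[name] - prev[name]) / elapsed_ns
        for name in curr
        if name in prev
    }


# Hypothetical samples taken one second apart (values are made up).
SAMPLE_T0 = "drm-driver:\tpanthor\ndrm-engine-panthor:\t1000000 ns\n"
SAMPLE_T1 = "drm-driver:\tpanthor\ndrm-engine-panthor:\t251000000 ns\n"

if __name__ == "__main__":
    busy = utilisation(
        parse_engine_ns(SAMPLE_T0),
        parse_engine_ns(SAMPLE_T1),
        elapsed_ns=1_000_000_000,
    )
    for name, fraction in busy.items():
        print(f"{name}: {fraction:.1%}")
```

A real tool would read the text from /proc/<pid>/fdinfo/<fd> for the
render node file descriptor and take elapsed_ns from a monotonic clock.
Note that, per the v5 changelog entry, the engine key only shows up when
the corresponding profiling mode is enabled, so an empty result is a
valid outcome.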