On Tue, Jun 7, 2022 at 1:56 AM Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote: > > > On 06/06/2022 20:54, Rob Clark wrote: > > From: Rob Clark <robdclark@xxxxxxxxxxxx> > > > > Similar to AMD commit > > 874442541133 ("drm/amdgpu: Add show_fdinfo() interface"), using the > > infrastructure added in previous patches, we add basic client info > > and GPU engine utilisation for msm. > > > > Example output: > > > > # cat /proc/`pgrep glmark2`/fdinfo/6 > > pos: 0 > > flags: 02400002 > > mnt_id: 21 > > ino: 162 > > drm-driver: msm > > drm-client-id: 7 > > drm-engine-gpu: 1734371319 ns > > drm-cycles-gpu: 1153645024 > > drm-maxfreq-gpu: 800000000 Hz > > > > See also: https://patchwork.freedesktop.org/patch/468505/ > > > > Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx> > > --- > > Documentation/gpu/drm-usage-stats.rst | 21 +++++++++++++++++++++ > > drivers/gpu/drm/msm/msm_drv.c | 19 ++++++++++++++++++- > > drivers/gpu/drm/msm/msm_gpu.c | 21 +++++++++++++++++++-- > > drivers/gpu/drm/msm/msm_gpu.h | 19 +++++++++++++++++++ > > 4 files changed, 77 insertions(+), 3 deletions(-) > > > > diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst > > index 6c9f166a8d6f..60e5cc9c13ad 100644 > > --- a/Documentation/gpu/drm-usage-stats.rst > > +++ b/Documentation/gpu/drm-usage-stats.rst > > @@ -105,6 +105,27 @@ object belong to this client, in the respective memory region. > > Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB' > > indicating kibi- or mebi-bytes. > > > > +- drm-cycles-<str> <uint> > > + > > +Engine identifier string must be the same as the one specified in the > > +drm-engine-<str> tag and shall contain the number of busy cycles for the given > > +engine. > > + > > +Values are not required to be constantly monotonic if it makes the driver > > +implementation easier, but are required to catch up with the previously reported > > +larger value within a reasonable period. Upon observing a value lower than what > > +was previously read, userspace is expected to stay with that larger previous > > +value until a monotonic update is seen. > > + > > +- drm-maxfreq-<str> <uint> [Hz|MHz|KHz] > > + > > +Engine identifier string must be the same as the one specified in the > > +drm-engine-<str> tag and shall contain the maxium frequence for the given > > maximum frequency > > > +engine. Taken together with drm-cycles-<str>, this can be used to calculate > > +percentage utilization of the engine, whereas drm-engine-<str> only refects > > reflects > > > +time active without considering what frequency the engine is operating as a > > +percentage of it's maximum frequency. > > Cycles vs max freq sounds very useful. My reservations is that how come > the idea hasn't happened in the CPU world. Or maybe it has and I am > un-informed? I do often pay attention to both where tasks get scheduled, and the individual CPU freq when I'm profiling CPU side stuff (eg. in perfetto) I could also report "always-count" cycles, I think, which could be used by gputop to derive freq. I'd have to think about that a bit, since keeping the result monotinic(ish) might be a bit tricky (the hw counter loses state across runtime suspend) > In any case, if going with this I think we need to clarify the text that > the value should reflect the current soft limit, where the driver > supports that, in case it has been set to lower than the maximum > frequency hardware can support. I am thinking about avoiding "my gpu > cannot hit 100%" support incidents in cases when user/admin lowered the > soft limit for some reason. Possibly does not apply to msm but can apply > to i915, if we decided to export the same data. Yes, with pm-qos thermal or userspace could limit the max freq.. but we also internally use a pm-qos constraint to reduce freq when the GPU is idle, and I don't think there is a good way to differentiate *which* constraint is which. I'll add something involving the word "recommended" ;-) BR, -R > > No other gotchas come to mind at the moment. > > Regards, > > Tvrtko > > > + > > =============================== > > Driver specific implementations > > =============================== > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c > > index 14ab9a627d8b..57a66093e671 100644 > > --- a/drivers/gpu/drm/msm/msm_drv.c > > +++ b/drivers/gpu/drm/msm/msm_drv.c > > @@ -948,7 +948,24 @@ static const struct drm_ioctl_desc msm_ioctls[] = { > > DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_QUERY, msm_ioctl_submitqueue_query, DRM_RENDER_ALLOW), > > }; > > > > -DEFINE_DRM_GEM_FOPS(fops); > > +static void msm_fop_show_fdinfo(struct seq_file *m, struct file *f) > > +{ > > + struct drm_file *file = f->private_data; > > + struct drm_device *dev = file->minor->dev; > > + struct msm_drm_private *priv = dev->dev_private; > > + struct drm_printer p = drm_seq_file_printer(m); > > + > > + if (!priv->gpu) > > + return; > > + > > + msm_gpu_show_fdinfo(priv->gpu, file->driver_priv, &p); > > +} > > + > > +static const struct file_operations fops = { > > + .owner = THIS_MODULE, > > + DRM_GEM_FOPS, > > + .show_fdinfo = msm_fop_show_fdinfo, > > +}; > > > > static const struct drm_driver msm_driver = { > > .driver_features = DRIVER_GEM | > > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c > > index eb8a6663f309..333a9a299b41 100644 > > --- a/drivers/gpu/drm/msm/msm_gpu.c > > +++ b/drivers/gpu/drm/msm/msm_gpu.c > > @@ -4,6 +4,8 @@ > > * Author: Rob Clark <robdclark@xxxxxxxxx> > > */ > > > > +#include "drm/drm_drv.h" > > + > > #include "msm_gpu.h" > > #include "msm_gem.h" > > #include "msm_mmu.h" > > @@ -146,6 +148,16 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu) > > return 0; > > } > > > > +void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx, > > + struct drm_printer *p) > > +{ > > + drm_printf(p, "drm-driver:\t%s\n", gpu->dev->driver->name); > > + drm_printf(p, "drm-client-id:\t%u\n", ctx->seqno); > > + drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns); > > + drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles); > > + drm_printf(p, "drm-maxfreq-gpu:\t%lu Hz\n", gpu->fast_rate); > > +} > > + > > int msm_gpu_hw_init(struct msm_gpu *gpu) > > { > > int ret; > > @@ -652,7 +664,7 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring, > > { > > int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT; > > volatile struct msm_gpu_submit_stats *stats; > > - u64 elapsed, clock = 0; > > + u64 elapsed, clock = 0, cycles; > > unsigned long flags; > > > > stats = &ring->memptrs->stats[index]; > > @@ -660,12 +672,17 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring, > > elapsed = (stats->alwayson_end - stats->alwayson_start) * 10000; > > do_div(elapsed, 192); > > > > + cycles = stats->cpcycles_end - stats->cpcycles_start; > > + > > /* Calculate the clock frequency from the number of CP cycles */ > > if (elapsed) { > > - clock = (stats->cpcycles_end - stats->cpcycles_start) * 1000; > > + clock = cycles * 1000; > > do_div(clock, elapsed); > > } > > > > + submit->queue->ctx->elapsed_ns += elapsed; > > + submit->queue->ctx->cycles += cycles; > > + > > trace_msm_gpu_submit_retired(submit, elapsed, clock, > > stats->alwayson_start, stats->alwayson_end); > > > > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h > > index 6def00883046..4911943ba53b 100644 > > --- a/drivers/gpu/drm/msm/msm_gpu.h > > +++ b/drivers/gpu/drm/msm/msm_gpu.h > > @@ -361,6 +361,22 @@ struct msm_file_private { > > /** cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE */ > > char *cmdline; > > > > + /** > > + * elapsed: > > + * > > + * The total (cumulative) elapsed time GPU was busy with rendering > > + * from this context in ns. > > + */ > > + uint64_t elapsed_ns; > > + > > + /** > > + * cycles: > > + * > > + * The total (cumulative) GPU cycles elapsed attributed to this > > + * context. > > + */ > > + uint64_t cycles; > > + > > /** > > * entities: > > * > > @@ -544,6 +560,9 @@ static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val) > > int msm_gpu_pm_suspend(struct msm_gpu *gpu); > > int msm_gpu_pm_resume(struct msm_gpu *gpu); > > > > +void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx, > > + struct drm_printer *p); > > + > > int msm_submitqueue_init(struct drm_device *drm, struct msm_file_private *ctx); > > struct msm_gpu_submitqueue *msm_submitqueue_get(struct msm_file_private *ctx, > > u32 id);