On 30/03/2023 19:31, Umesh Nerlige Ramappa wrote:
+ Joonas for comments on this
On Thu, Mar 30, 2023 at 02:38:03PM +0100, Tvrtko Ursulin wrote:
On 30/03/2023 01:41, Umesh Nerlige Ramappa wrote:
MTL introduces separate GTs for render and media. This complicates the
definition of frequency and rc6 counters for the GPU as a whole since
each GT has an independent counter. The best way to support this change
is to deprecate the GPU-specific counters and create GT-specific
counters, however that just breaks ABI. Since perf tools and scripts may
be decentralized with probably many users, it's hard to deprecate the
legacy counters and have all the users on board with that.
Re-introduce the legacy counters and support them as min/max of
GT-specific counters as necessary to ensure backwards compatibility.
I915_PMU_ACTUAL_FREQUENCY - will show max of GT-specific counters
I915_PMU_REQUESTED_FREQUENCY - will show max of GT-specific counters
I915_PMU_INTERRUPTS - no changes since it is GPU specific on all
platforms
I915_PMU_RC6_RESIDENCY - will show min of GT-specific counters
I915_PMU_SOFTWARE_GT_AWAKE_TIME - will show max of GT-specific counters
IMO max/min games are _very_ low value and probably just confusing.
By value, do you mean ROI or actually that the values would be incorrect?
Both really.
I am not convinced we need to burden the kernel with this. New
platform, new counters.. userspace can just deal with it.
I agree and would prefer to drop this patch. There are some counter
arguments, I have added Joonas here for comments.
1) an app/script hard-coded with the legacy events would be used on a
new platform and fail and we should maintain backwards compatibility.
I thought we pretty much agreed multiple times in the past (on different
topics) that a new platform can require new userspace.
PMU is probably even a more clear cut case since it is exposing hardware
counters (or close) so sometimes it is not even theoretically possible
to preserve "backward" compatibility.
(I double quote backward because I think real backward compatibility
does not apply on a new platform. And MTL is under force probe still.)
So for me it all comes under the "would be nice" category. But since we
need to add kernel code to do it, code which asy intel_gpu_top could run
in userspace, I am not at all convinced it wouldn't be a bad idea.
The aggregated counters wouldn't even be giving the full picture.
So I'd simply add tiles/gt support to intel_gpu_top. Same as it
currently can do "-p" on the command line, or '1' in the interactive
mode, to aggregate the engine classes into one line item, I'd extend
that concept into frequencies and RC6.
By default we start with normalized values and in physical mode we show
separate counters per tile/gt.
Someone running old intel_gpu_top on MTL gets to see nothing since the
counter names are different. Which is IMO fine - better than showing
tile 0 data, or some minimums/maximums from one tile only.
2) the sysfs attributes for rc6/frequency have already adopted an
aggregate vs gt0/gt1 approach to address that and pmu should have a
similar solution (or rather, PMU and the sysfs approaches should match
based on whatever is the approach)
Yeah I disagreed with min/max reads in sysfs too and am pretty sure I
expressed that at the time. :shrug:
But I don't think there is a strong argument that PMU needs to follow.
Only impact is to people who access perf_event_open directly so yeah, if
there are such users, they will need to add multi-tile support.
Regards,
Tvrtko