Hi,
I've now tested v5 with an old Ubuntu kernel on KBL, and with the latest
drm-tip kernel on SNB, HSW, BYT, BSW and BDW GT & GT3.
Generic test results
--------------------
* Tool works on all of them
* The new error messages and headings look good
* Idle IMC read amounts correspond to expected values on SNB & HSW.
The much smaller values on BDW & SKL are due to FBC (how well
it compresses naturally depends on screen content).
BYT & BSW
---------
* IMC, power usage and actual(?) freq values are missing.
-> You can get actual freq by polling CAGF register, represented by:
/sys/class/drm/card0/gt_act_freq_mhz
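As a rough illustration of the polling idea, here is a minimal Python sketch
that samples the sysfs file named above and averages the readings. The
helper names (read_mhz, average_mhz), the sample count and the 100 ms
interval are assumptions made for this example, and the card index may
differ on a given system:

```python
import time
from pathlib import Path

# Path taken from the message above; "card0" may differ on other systems.
ACT_FREQ = Path("/sys/class/drm/card0/gt_act_freq_mhz")


def read_mhz(path=ACT_FREQ):
    """Return the current value in MHz, or None if the file cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def average_mhz(samples):
    """Average the valid samples, ignoring failed reads (None)."""
    valid = [s for s in samples if s is not None]
    return sum(valid) / len(valid) if valid else None


if __name__ == "__main__":
    # Hypothetical sampling parameters: 10 samples, 100 ms apart.
    samples = []
    for _ in range(10):
        samples.append(read_mhz())
        time.sleep(0.1)
    print("average actual freq (MHz):", average_mhz(samples))
```

Averaging several samples is only one way to smooth the instantaneous
value; a real tool would pick its own sampling period.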
Normally i915 driver maps uncore power usage to GPU power usage,
but BYT is missing that (and ram power usage). However, RAPL
does report package & core values...
Suggestions
-----------
Maybe on platforms where RAPL doesn't report "uncore" power usage,
you could just deduct RAPL reported "core" power consumption from
the "package" power consumption, and report that as "GPU" power
usage? (Or do that in i915 directly)
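A minimal Python sketch of that subtraction, assuming the RAPL counters are
read through the Linux powercap sysfs interface. The zone paths below
(intel-rapl:0 for "package", intel-rapl:0:0 for "core") are an assumption:
the subzone order varies between machines, so the "name" file in each zone
should be checked, and reading energy_uj may need root. Note that the
difference would also include any other non-core package consumers, so it
is at best an upper-bound estimate for the GPU:

```python
import time
from pathlib import Path

# Assumed zone layout; verify with the "name" file in each directory.
PKG_ZONE = Path("/sys/class/powercap/intel-rapl:0")
CORE_ZONE = Path("/sys/class/powercap/intel-rapl:0:0")


def read_uj(zone, name="energy_uj"):
    """Read an integer microjoule value from a powercap zone file."""
    return int((zone / name).read_text().strip())


def power_watts(e0_uj, e1_uj, dt_s, max_range_uj):
    """Average power between two energy readings, handling one wraparound."""
    delta = e1_uj - e0_uj
    if delta < 0:  # the energy counter wrapped between the two reads
        delta += max_range_uj
    return delta / 1e6 / dt_s


def estimated_gpu_watts(pkg_w, core_w):
    """Package minus core power, clamped at zero."""
    return max(pkg_w - core_w, 0.0)


if __name__ == "__main__":
    interval = 1.0  # hypothetical sampling period in seconds
    pkg_max = read_uj(PKG_ZONE, "max_energy_range_uj")
    core_max = read_uj(CORE_ZONE, "max_energy_range_uj")
    p0, c0 = read_uj(PKG_ZONE), read_uj(CORE_ZONE)
    time.sleep(interval)
    p1, c1 = read_uj(PKG_ZONE), read_uj(CORE_ZONE)
    pkg_w = power_watts(p0, p1, interval, pkg_max)
    core_w = power_watts(c0, c1, interval, core_max)
    print(f"package {pkg_w:.2f} W, core {core_w:.2f} W, "
          f"estimated GPU {estimated_gpu_watts(pkg_w, core_w):.2f} W")
```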
You also need to either update the manual, or implement -o and -e
options for the new version of intel_gpu_top. CSV output of all
the reported values would be nice.
You might mention in manual as an example how to calculate
idle screen update bandwidth, and that it's impacted by:
- PSR (panel self refresh, depends on display supporting it):
/sys/kernel/debug/dri/0/i915_edp_psr_status
- FBC (frame buffer compression, enabled on newer GENs)
/sys/kernel/debug/dri/0/i915_fbc_status
- end-to-end RBC (render buffer compression, requires modifiers
support i.e. GEN9+ GPU and X & Mesa with DRI3 v1.2 [1] support)
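For the manual example, the uncompressed idle screen update bandwidth is
just width * height * bytes per pixel * refresh rate. A small Python sketch
using the FullHD, 4 bytes per pixel, 60Hz figures from this thread (the
function name is made up for the example):

```python
MIB = 1024 * 1024


def scanout_bytes_per_second(width, height, bytes_per_pixel, refresh_hz):
    """Bytes read per second to scan out one uncompressed framebuffer."""
    return width * height * bytes_per_pixel * refresh_hz


if __name__ == "__main__":
    bw = scanout_bytes_per_second(1920, 1080, 4, 60)
    print(f"{bw / MIB:.1f} MiB/s")  # prints "474.6 MiB/s"
```

With PSR, FBC or RBC active, the measured IMC read rate can be well below
this figure.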
- Eero
[1] Requires building the latest git versions of Mesa, libxcb, X server
and a few other things, and adding this to the X server config:
-------------------------------
Section "ServerFlags"
Option "Debug" "dmabuf_capable"
EndSection
-------------------------------
On 03.04.2018 20:18, Tvrtko Ursulin wrote:
On 03/04/2018 15:06, Eero Tamminen wrote:
On 03.04.2018 12:36, Tvrtko Ursulin wrote:
On 29/03/2018 15:30, Eero Tamminen wrote:
[...]
Old tool showed also GPU system memory interface (GAM) busyness.
That was valuable info, and reasonably accurate for stable loads.
Could this tool show also either that information (preferred), or
bandwidth utilized by GPU/CPU/display?
(Latest kernels offer GPU memory bandwidth usage through perf
"uncore_imc" "data_reads" & "data_writes" counters.)
Excellent suggestion and I've added IMC data_reads and data_writes to
the tool.
Thanks, it looks fine too. I'm just wondering about the numbers
it's reporting on SKL GT2...
AFAIK IMC counters are for uncore, so I thought that they should
correspond to GTI (memory interface to outside of GPU) read and
write HW counter values. While it seemed in some cases quite close,
in some cases it showed a lot smaller (2/3) value than expected.
I can understand why reads are sometimes larger, because I think
uncore will include also display engine display content reads.
However, I don't see how uncore writes could be considerably smaller
than the GTI interface write amount.
(GTI interface reports the expected value which corresponds directly
to what my test application is doing (64x blended FullHD layer writes).)
Idle machine read amounts are also much smaller (60-65MB/s) than what
I think display update read should be (1920*1080*4*60Hz = 475MiB/s).
Any ideas for these two discrepancies?
I'm afraid I am not familiar with the uncore IMC, but we could always
approach its authors?
Is "wait" value supposed to be IO-wait for given engine interface?
I never saw that change from 0%, although IO-wait in top jumped
from 0 to 20-30% with my test GPU load.
No, that is time spent in MI_WAIT_FOR_EVENT.
Could you add that info to the UI?
E.g. just have "MI" on top of the "wait" column.
Like a full header strip? Yeah makes sense, I'll add it.
I think it is not used much in the current codebase.
What are you using to validate that it reports the correct value?
That would be igt/tests/perf_pmu/event-wait-rcs0.
HW specific test results
------------------------
BYT:
* Reports "Failed to initialize PMU!" although old intel_gpu_top
works fine.
HSW GT2, BDW GT3, SKL GT2 and KBL GT3e seem to work fine except
for the "wait" value.
I never saw the blitter engine do anything, but that's because
modesetting uses just the 3D pipeline, and because I couldn't get
Intel DDX to work with the rest of the latest git version of the X / 3D stack.
Thank you for testing this so thoroughly - this was really invaluable
since I don't have access to such a number of platforms. I've tried to
fix all this in the latest version.
Machines are currently running tests, I'll check these tomorrow.
Thanks!
Kernel version support
----------------------
My HW specific testing above was with drm-tip kernel, but I did one test
also with Ubuntu 16.04 v4.4 kernel (which includes v4.6 or v4.8 i915
backport) on KBL. For that, the tool reported:
"Failed to detect engines!"
Although the previous intel_gpu_top works fine with that kernel version.
Same happens also with Ubuntu 17.04 v4.13 kernel.
-> If new version needs a certain kernel version, it should tell
which version is required.
Yep, at least 4.16 is needed so I have added this info to the error
message.
IMHO the message is a bit ambiguous:
Failed to detect engines! Kernel 4.16 or newer?
I would suggest checking whether kernel is new enough, and if not:
Kernel X.YY detected, 4.16 or newer required.
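The suggested check could look roughly like the following Python sketch
(intel_gpu_top itself is C; this only illustrates the logic). The function
names are invented for the example, and the release string is parsed on the
assumption that it starts with "major.minor". Since distro kernels can carry
i915 backports, as with the Ubuntu kernel mentioned above, a pure version
comparison is only a heuristic:

```python
import platform
import re

REQUIRED = (4, 16)  # minimum version stated in this thread


def parse_kernel_release(release):
    """Extract (major, minor) from a string like '4.13.0-16-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def version_message(release, required=REQUIRED):
    """Return an error message if the kernel is too old, else None."""
    found = parse_kernel_release(release)
    if found is None:
        return f"Could not parse kernel release '{release}'."
    if found < required:
        return (f"Kernel {found[0]}.{found[1]} detected, "
                f"{required[0]}.{required[1]} or newer required.")
    return None


if __name__ == "__main__":
    message = version_message(platform.release())
    print(message or "Kernel version looks new enough.")
```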
Maybe yeah. I was planning to improve error messages altogether but
forgot. Will see what improvements make sense.
Regards,
Tvrtko