On 6/21/22 21:40, Gordon Messmer wrote:
> On 6/21/22 13:10, Matthew Miller wrote:
>> Phoronix credits this to those distros shipping with P-state
>> Performance by default.
>
> Yes, but I doubt that for several reasons: First, it's a claim without
> evidence. That setting isn't the only difference between any two
> systems tested. Second, the claim doesn't make any *sense*. Systems
> with intel_pstate balanced aren't supposed to be noticeably slower for
> sustained CPU intensive workloads. The intel_pstate driver is supposed
> to scale the frequency up under load in the "balanced" configuration,
> delivering performance when it is needed and power saving when it
> isn't. Third, I can run their tests on my own system in an intel_pstate
> performance mode and an intel_pstate balanced mode, and the test
> results are nearly identical, which is the expected outcome.
I did some work this week to see if I could learn anything from
Phoronix's article [1], and came up pretty much dry. I cannot replicate
any of the differences that I would expect to be able to. More than
anything else, their results look like evidence of a bug in the Xeon
Platinum 8380.
In retrospect, the first thing that should have stood out to me when I
looked at this ~3 weeks ago (but which I missed at the time) was that if
I pull the phoronix/pts container image and "run pts/compress-zstd" with
"Compression Level: 3, Long Mode", I get better results on my XPS 13
laptop than they did on their Xeon. And, while cpubenchmark.net does
suggest that my i5 CPU [2] has better single-core test results than the
Xeon [3], the zstd test should not be limited to a single core: on my
laptop, top reported the zstd process typically using ~400% CPU.
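For reference, reproducing that run looks roughly like this (a sketch
assuming podman, and assuming the image's default entrypoint drops you
into the interactive PTS shell; docker should work the same way):

  $ podman pull docker.io/phoronix/pts
  $ podman run -it docker.io/phoronix/pts
  # then, at the PTS prompt:
  run pts/compress-zstd
  # and pick "Compression Level: 3, Long Mode" from the test options menu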
The first thing I tried to reproduce was a difference between
"performance" and "powersave" settings in the intel_pstate cpufreq
driver. I used the zstd compression test on my only Intel CPU, which is
in my XPS 13 laptop. In the default Fedora WS configuration,
scaling_driver is intel_pstate, scaling_governor is powersave, and (I
believe) energy_performance_preference is balance_performance. In that
configuration, typical values for scaling_cur_freq were significantly
lower than typical values after changing energy_performance_preference
to performance, and scaling_governor to performance. So on this laptop,
I'm confident that the governor and EPP settings are behaving as
expected. But zstd benchmark results are essentially indistinguishable
when running in one mode vs the other, because the powersave mode for
the intel_pstate driver will scale CPU speed up on demand.
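For anyone who wants to poke at the same knobs, everything lives under
sysfs. A minimal sketch (cpu0 shown; the values are what I saw in the
default Fedora WS configuration, and writes need root):

  $ cd /sys/devices/system/cpu/cpu0/cpufreq
  $ cat scaling_driver scaling_governor energy_performance_preference
  intel_pstate
  powersave
  balance_performance
  # flip both knobs to performance on every CPU:
  $ for p in /sys/devices/system/cpu/cpu*/cpufreq; do
      echo performance | sudo tee $p/scaling_governor $p/energy_performance_preference
    done
  # watch the current frequency while a benchmark runs:
  $ watch 'grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq'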
In addition, Phoronix has benchmarked Intel systems in the past [4] to
determine the effective difference between the intel_pstate powersave
and performance modes, and found minimal differences on an Intel i9 CPU.
I also tested the svt-av1 benchmark on this system in both modes, since
it was another CPU-bound test where Phoronix reported a significant
difference and attributed it to the P-State governor setting. Again, I
saw no significant difference between the performance and powersave
results.
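The A/B comparison itself was just the same test run twice with the
governor flipped in between. Roughly, assuming the phoronix-test-suite
CLI is installed on the host:

  $ for g in performance powersave; do
      echo $g | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
      phoronix-test-suite benchmark pts/svt-av1
    done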
All of this suggests that the Xeon was simply not scaling up for these
tests. Given its large number of cores, perhaps the benchmarks weren't
putting enough load on the system to trigger scaling up. Or (as a
matter of *wild* speculation) maybe it was scaling up some cores, but
Linux was shuffling tasks between cores and "missing" the fast ones.
Whatever the case, the big differences between distributions reported by
Phoronix are probably limited to this class of CPUs.
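If anyone with access to one of these Xeons wants to test the shuffling
hypothesis, pinning the workload so the scheduler can't migrate it would
be one way to rule it in or out. A hypothetical invocation (the file
name is a placeholder; -T4 matches the ~4 threads I saw zstd use):

  $ taskset -c 0-3 zstd -3 --long -T4 some-large-file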
If this is *normal* behavior for those CPUs, then maybe the Fedora
Server group would want to change the default governor, or emphasize the
importance of the CPU governor selection in their documentation.
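For what it's worth, the change itself is easy, either at runtime with
cpupower (from the kernel-tools package) or persistently on the kernel
command line (kernels 5.9 and later):

  $ sudo cpupower frequency-set -g performance
  # or add to the kernel command line:
  cpufreq.default_governor=performance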
I also ran benchmarks on CentOS Stream 9 and Fedora Server 36, each
installed in a VM under CentOS Stream 9 libvirt, running on a host with
an AMD Ryzen 5 CPU [5], with the host CPU configuration copied to the
guests. As VMs, these would not apply any cpufreq management of their
own, and if there were any differences resulting from the CPU
architecture target, they should be apparent in these tests. Test
results for these VMs were 20-40% better than the Xeon's best results,
but results under the CentOS Stream 9 VM were essentially the same as
results under the Fedora Server 36 VM. It's probably still interesting
to run the full suite and see if any other tests show significant
differences, and I'll try to do that later.
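An easy way to confirm that the guests do no frequency management of
their own: KVM guests normally don't expose a cpufreq policy at all, so
inside the VM the directory simply isn't there.

  $ ls /sys/devices/system/cpu/cpu0/cpufreq
  ls: cannot access '/sys/devices/system/cpu/cpu0/cpufreq': No such file or directory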
I think that's enough to convince me that I was wrong to doubt that the
intel_pstate configuration was the reason that these results differed,
although I still believe that if that is the case, then the CPU's
internal pstate selection is broken.
1: https://www.phoronix.com/scan.php?page=article&item=h1-2022-linux&num=1
2: https://www.cpubenchmark.net/cpu.php?cpu=Intel+Core+i5-1135G7+%40+2.40GHz&id=3830
3: https://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+Platinum+8380+%40+2.30GHz&id=4483
4: https://www.phoronix.com/scan.php?page=article&item=linux50-pstate-cpufreq&num=1
5: https://www.cpubenchmark.net/cpu.php?cpu=AMD+Ryzen+5+5600X&id=3859