On 10/20/23 13:23, Richard W.M. Jones wrote:
> Today I've read (twice) that the overhead of frame pointers on the
> runtime of the compiler, GCC, is 10%. This number is nonsense. The
> actual overhead is 1%, and I have done the tests that show this.
Both the 1% and the 10% results can be valid. In particular, I have seen
variance of up to 15% in CPU time for consecutive runs of the same CPU-
saturating task on the SAME physical machine, because Linux does not
consider cache coloring when allocating physical page frames for virtual
memory, which has random effects on the performance of the data cache.
See https://en.wikipedia.org/wiki/Cache_coloring :

  "A virtual memory subsystem that lacks cache coloring is less
  deterministic with regards to cache performance, as differences in
  page allocation from one program run to the next can lead to large
  differences in program performance."

  "Page coloring is employed in operating systems such as Solaris,
  FreeBSD, NetBSD and Windows NT."
[Note the conspicuous absence of Linux from that list.]
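To see why this matters: a physically-indexed cache partitions pages into
"colors", where colors = cache_size / (associativity * page_size),
equivalently sets * line_size / page_size. A minimal sketch of the
arithmetic (the function name is mine, for illustration only):

/* Number of page colors in a physically-indexed, set-associative cache.
   Pages whose physical frame numbers are congruent modulo this count
   compete for the same cache sets; a kernel without page coloring hands
   out frames without regard to it, so conflict misses vary run-to-run. */
#include <stdio.h>

static unsigned long page_colors(unsigned long cache_bytes, unsigned ways,
                                 unsigned long page_bytes)
{
    return cache_bytes / ways / page_bytes;
}

int main(void)
{
    /* 6 MiB, 12-way L3 with 4 KiB pages (the CPU shown below): 128 */
    printf("%lu\n", page_colors(6UL << 20, 12, 4096));
    return 0;
}

Two runs of the same program can therefore see different conflict-miss
patterns purely from which physical frames the kernel happened to allocate.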
Other sources of real-time contention should be considered, too.
Queuing delays in a file system due to encryption, journaling, block
allocation and placement, etc., might mask real-time measurement of
CPU+cache. Any potentially-competing activity, such as a graphical
desktop environment, use of the network, video, or audio, or cron jobs
and tasks controlled by systemd, should be minimized. It may be best to
measure when the machine has been booted to single-user mode.
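One mitigation that is easy to apply from inside the benchmark itself is
pinning it to a single CPU, so the scheduler does not migrate it between
cores (and their private caches) mid-run. A sketch; this removes only one
source of variance and does nothing about page placement:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(0, &set);            /* run only on CPU 0 */
    if (sched_setaffinity(0, sizeof set, &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    /* ... run the CPU-saturating task here ... */
    return 0;
}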
Because of the impact of data-cache performance, it is important to
state the CPU, RAM, and cache characteristics when measuring performance,
for example the beginning of /proc/cpuinfo:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 94
model name : Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
stepping : 3
microcode : 0xf0
cpu MHz : 3418.725
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
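A small helper can capture those fields mechanically at the start of each
measurement run; this sketch just echoes the relevant lines for the first
processor:

#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    char line[256];

    if (f == NULL) { perror("/proc/cpuinfo"); return 1; }
    while (fgets(line, sizeof line, f) != NULL) {
        if (line[0] == '\n')     /* blank line ends the first CPU's block */
            break;
        if (strncmp(line, "model name", 10) == 0 ||
            strncmp(line, "cache size", 10) == 0 ||
            strncmp(line, "cpu cores", 9) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}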
On Intel x86_64 Core CPUs with hyperthreading, the two threads per core
compete for that core's 256 KiB L2 cache.
On x86_64, the CPUID instruction reports the cache organization,
which can be interpreted, for example:
22 GenuineIntel
TLB/Cache: eax=76036301 ebx=00f0b6ff ecx=00000000 edx=00c30000
1 repeat for more info
63
03 dTLB: 4 KByte pages, 4-way, 64 entries
76 iTLB: 2M/4M pages, fully associative, 8 entries
ff Use CPUID leaf 4
b6
f0 64-byte prefetching
c3
Cache: eax=1c004121 ebx=01c0003f ecx=0000003f edx=00000000
1 Data Cache
1 Cache Level (starts at 1)
1 Self-initializing
0 Fully associative
2 max # logical processors
8 max # physical cores
64 system coherency line size
1 physical line partitions
8 ways of associativity
64 number of sets
32768 total size
0 WBINVD/INVD acts on this level only
0 cache includes lower levels
0 complex cache indexing
Cache: eax=1c004122 ebx=01c0003f ecx=0000003f edx=00000000
2 Instruction Cache
1 Cache Level (starts at 1)
1 Self-initializing
0 Fully associative
2 max # logical processors
8 max # physical cores
64 system coherency line size
1 physical line partitions
8 ways of associativity
64 number of sets
32768 total size
0 WBINVD/INVD acts on this level only
0 cache includes lower levels
0 complex cache indexing
Cache: eax=1c004143 ebx=00c0003f ecx=000003ff edx=00000000
3 Unified Cache
2 Cache Level (starts at 1)
1 Self-initializing
0 Fully associative
2 max # logical processors
8 max # physical cores
64 system coherency line size
1 physical line partitions
4 ways of associativity
1024 number of sets
262144 total size
0 WBINVD/INVD acts on this level only
0 cache includes lower levels
0 complex cache indexing
Cache: eax=1c03c163 ebx=02c0003f ecx=00001fff edx=00000006
3 Unified Cache
3 Cache Level (starts at 1)
1 Self-initializing
0 Fully associative
16 max # logical processors
8 max # physical cores
64 system coherency line size
1 physical line partitions
12 ways of associativity
8192 number of sets
6291456 total size
0 WBINVD/INVD acts on this level only
1 cache includes lower levels
1 complex cache indexing
Cache: eax=00000000 ebx=00000000 ecx=00000000 edx=00000000
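That decode follows CPUID leaf 4 ("deterministic cache parameters").
A sketch of how such output can be produced with GCC's <cpuid.h>
(GCC or Clang on x86_64; the output format is mine, not the exact tool
used above):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned eax, ebx, ecx, edx;

    /* Each subleaf of leaf 4 describes one cache; type 0 ends the list. */
    for (unsigned sub = 0; ; sub++) {
        __cpuid_count(4, sub, eax, ebx, ecx, edx);
        unsigned type = eax & 0x1f;   /* 1 data, 2 instruction, 3 unified */
        if (type == 0)
            break;
        unsigned level      = (eax >> 5) & 0x7;
        unsigned line_size  = (ebx & 0xfff) + 1;
        unsigned partitions = ((ebx >> 12) & 0x3ff) + 1;
        unsigned ways       = ((ebx >> 22) & 0x3ff) + 1;
        unsigned sets       = ecx + 1;
        printf("type %u level %u: %u-way, %u sets, %u-byte lines = %u bytes\n",
               type, level, ways, sets, line_size,
               ways * partitions * line_size * sets);
    }
    return 0;
}

For the L3 above: 12 ways * 1 partition * 64-byte lines * 8192 sets
= 6291456 bytes, matching the dump.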