F40 Change Proposal: Optimized Binaries for the AMD64 Architecture (System-Wide)

Wiki -> https://fedoraproject.org/wiki/Changes/Optimized_Binaries_for_the_AMD64_Architecture

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes
process, proposals are publicly announced in order to receive
community feedback. This proposal will only be implemented if approved
by the Fedora Engineering Steering Committee.


== Summary ==

Additional paths will be inserted into the search path used for
executables on systems which have a compatible CPU.
Those additional paths will mirror the AMD64 "microarchitecture
levels" supported by the glibc-hwcaps mechanism: `x86-64-v2`,
`x86-64-v3`, `x86-64-v4`.
Systemd will be modified to insert the additional directories into the
`$PATH` environment variable (affecting all programs on the system)
and the equivalent internal mechanism in `systemd` (affecting what
executables are used by services).
Individual packages can provide optimized libraries via the
glibc-hwcaps mechanism and optimized executables via the extended
search path. This optimized code will be used if the CPU supports it.
The decision about ''which'' packages provide the optimized code, and
at which level, will be made by individual package maintainers based
on benchmark results.



== Owner ==
* Name: [[User:Zbyszek| Zbigniew Jędrzejewski-Szmek]]

NOTE: I'm writing and filing this proposal on the last day allowed
for system-wide proposals. It is too large for one person. If you are
interested, please let me know or even add yourself to the list of
Owners. I would love to have more people working on this.



== Detailed Description ==


Fedora binaries for the AMD64 architecture are compiled with
code-generation flags that support almost all CPU variants. But newer
generations of processors gained additional instructions that may be
used to generate faster code. A vendor-independent x86-64 psABI
supplement defines four "microarchitecture levels": `x86-64-v1` (the
baseline, our code targets this), `x86-64-v2` (+`SSE3` through
`SSE4.2`, CentOS targets this), `x86-64-v3` (+`AVX`, `AVX2`),
`x86-64-v4` (+`AVX512`) [1]. Code compiled for a higher
microarchitecture level crashes (with `SIGILL`, "illegal instruction")
on CPUs which do not support it. Benchmark results show small
differences in performance: usually in the range from -5% to +10%,
with no discernible difference for most
code, but '''some''' applications benefit, with gains of 120% in some
benchmarks [e.g. 2, 4].
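
To make the level definitions concrete, here is a minimal sketch of how a level could be derived from CPU feature flags. It is an illustration only, not the check used by glibc or systemd. The flag names follow the spelling used in Linux `/proc/cpuinfo` (for example `pni` for SSE3 and `abm` for LZCNT), the per-level sets are a simplified reading of the psABI supplement, and the names `LEVEL_FLAGS`, `microarch_level` and `read_cpu_flags` are hypothetical.

```python
# Simplified per-level feature sets, spelled as in Linux /proc/cpuinfo.
# This is an assumption of the sketch, not an authoritative list.
LEVEL_FLAGS = [
    # x86-64-v1 (baseline)
    {"lm", "cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2"},
    # x86-64-v2
    {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"},
    # x86-64-v3
    {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"},
    # x86-64-v4
    {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},
]


def microarch_level(flags):
    """Return N for the highest x86-64-vN whose flags are all present.

    A level only counts if every lower level is satisfied as well.
    Returns 0 if even the baseline flags are missing.
    """
    level = 0
    for required in LEVEL_FLAGS:
        if not required <= flags:
            break
        level += 1
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU listed in /proc/cpuinfo."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()
```

On a Linux machine, `microarch_level(read_cpu_flags())` would give the number N of the highest matching `x86-64-vN` level under these assumptions.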

Over the years, various people have expressed interest in raising the
required microarchitecture levels. But we have been very conservative
in making changes, because support is missing in many older CPUs that
are still in use, and in fact, even in some CPUs produced and sold
today. By raising the required level we would make Fedora completely
unusable on many machines. It also seems that recompiling ''all''
packages with the changed options would largely be a waste of
resources, because for most code it makes no difference. But for some
of the numerical or cryptographic code there are noticeable gains and
it seems to be worth the effort to provide optimized code. This also
makes Fedora more attractive to people interested in optimization.

The dynamic linker already has the `glibc-hwcaps` mechanism to load
optimized implementations of ''shared objects'' [3]. This means that
packages can provide optimized libraries and the linker will
automatically load them from separate directories if appropriate.
(For AMD64, this is `/usr/lib64/glibc-hwcaps/x86-64-v{2,3,4}/`.)
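
The lookup order can be sketched roughly as follows. This is a simplified model, not the dynamic linker's actual implementation: it assumes that subdirectories are tried from the highest supported level downwards before falling back to the plain library directory, and the function name `hwcaps_library_candidates` is hypothetical.

```python
from pathlib import PurePosixPath


def hwcaps_library_candidates(soname, level, libdir="/usr/lib64"):
    """List the paths tried for `soname`, most specific first.

    `level` is the highest supported x86-64-vN level (1 to 4).
    Level 1 is the baseline and has no glibc-hwcaps subdirectory.
    """
    base = PurePosixPath(libdir)
    candidates = [
        base / "glibc-hwcaps" / f"x86-64-v{n}" / soname
        for n in range(min(level, 4), 1, -1)
    ]
    candidates.append(base / soname)  # unoptimized fallback
    return [str(p) for p in candidates]
```

For a CPU at level 3, a hypothetical `libfoo.so.1` would be looked up in the `x86-64-v3` subdirectory first, then in `x86-64-v2`, and finally in `/usr/lib64` itself.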

To extend the glibc-hwcaps mechanism to ''executables'', `systemd`
will be modified to extend the search path with appropriate
directories. When started, it will check the CPU capabilities and
extend its internal executable search path, which is also used to set
`$PATH` for services. (For AMD64,
`/usr/bin/glibc-hwcaps/x86-64-v{2,3,4}/`.)
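
A minimal sketch of the path extension, under stated assumptions: the proposal does not specify where in the search path the new directories are placed, so this example inserts them directly in front of `/usr/bin`, highest level first. The function name `extend_search_path` is hypothetical and this is not systemd code.

```python
def extend_search_path(path, level, bindir="/usr/bin"):
    """Insert glibc-hwcaps directories in front of `bindir` in `path`.

    `path` is a colon-separated search path and `level` the highest
    supported x86-64-vN level. The placement (immediately before
    `bindir`, highest level first) is an assumption of this sketch.
    """
    extra = [
        f"{bindir}/glibc-hwcaps/x86-64-v{n}"
        for n in range(min(level, 4), 1, -1)
    ]
    result = []
    for entry in path.split(":"):
        if entry == bindir:
            result.extend(extra)
        result.append(entry)
    return ":".join(result)
```

With this placement, entries that precede `/usr/bin` (such as `/usr/local/bin`) keep their priority, and an optimized executable shadows only its counterpart in `/usr/bin`. The result can be passed as the `path` argument of `shutil.which()` to see which variant of a command would be selected.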

Note: the ELF format provides the IFUNC mechanism to dynamically
select a variant of a function (symbol) when an executable is loaded
[5]. This is in particular used to load code using specific CPU
instructions when those are supported. This mechanism is more
general (because it allows arbitrary selection criteria), more
fine-grained (because there can be other variants than just a few
fixed microarchitecture levels), and more efficient (because only the
parts of the code that benefit from this need to be provided in
multiple variants). In particular, glibc already makes extensive use
of this to provide optimized code, which is then widely used by other
libraries and programs. This means that even though we compile code in
a way where the lowest baseline is supported, modern CPU instructions
are already widely used. This is one of the reasons why compiling for
a higher baseline often doesn't make any difference in benchmarks. The
IFUNC mechanism or an equivalent mechanism should generally be
preferred. Nevertheless, that needs to be implemented in the program or
library itself, which is not trivial. The two mechanisms in this
Proposal are intended for the packages which do not support IFUNCs or
some other equivalent mechanism.
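
The idea behind IFUNC (a resolver runs once and the chosen variant is bound to the symbol) can be illustrated with a loose analogy. Real IFUNCs are an ELF feature used from C or assembly; the Python below only mimics the resolve-once pattern, and `select_checksum` and both variants are hypothetical names.

```python
def select_checksum(cpu_flags):
    """Resolver: runs once and returns the implementation to bind."""

    def checksum_generic(data):
        # Portable variant, works on every CPU.
        total = 0
        for byte in data:
            total = (total + byte) % 65521
        return total

    def checksum_avx2(data):
        # Stand-in for a vectorized variant. It must return exactly the
        # same results as the generic one.
        return sum(data) % 65521

    return checksum_avx2 if "avx2" in cpu_flags else checksum_generic


# Bound once, as at load time. Callers just call checksum() and never
# see which variant was chosen.
checksum = select_checksum({"sse2", "avx2"})
```

Only the function that benefits is provided in several variants, while the rest of the program is shared. This is the efficiency argument made above.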

[1] https://hackweek.opensuse.org/all/projects/support-glibc-hwcaps-and-micro-architecture-package-generation<BR>
[2] https://gitlab.archlinux.org/archlinux/rfcs/-/blob/master/rfcs/0002-march.rst<BR>
[3] https://sourceware.org/pipermail/libc-alpha/2021-February/122207.html<BR>
[4] https://blog.centos.org/2023/08/centos-isa-sig-performance-investigation/<BR>
[5] https://jasoncc.github.io/gnu_gcc_glibc/gnu-ifunc.html<BR>

Glibc-hwcaps together with the new feature in systemd provide a
generic mechanism. It will be up to individual packages to actually
provide code which makes use of it. Individual package maintainers are
encouraged to benchmark their packages after recompilation, and
provide the optimized variants if useful. (I.e. the code in question
is measurably faster '''and''' the program is run often enough for
this to make a difference.)
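
The two-part criterion can be expressed as a small helper. This is a sketch only: the thresholds `min_gain` and `min_runs_per_day` are hypothetical placeholders, not values from this proposal, and the gain is defined here as the increase in speed (so a run that takes half the time is a 100% gain).

```python
from statistics import median


def relative_gain(baseline_seconds, optimized_seconds):
    """Speed gain in percent. Positive means the optimized build is faster."""
    base = median(baseline_seconds)
    opt = median(optimized_seconds)
    return (base / opt - 1.0) * 100.0


def worth_packaging(gain_percent, runs_per_day,
                    min_gain=5.0, min_runs_per_day=1.0):
    """Apply both conditions: measurably faster AND run often enough.

    Both thresholds are hypothetical; maintainers would pick their own.
    """
    return gain_percent >= min_gain and runs_per_day >= min_runs_per_day
```

Using the median of several runs reduces the influence of outliers such as a cold cache on the first run.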

The Change Owners will implement the packaging changes for a few
packages while developing the general mechanism and will submit those
as pull requests. Other maintainers are asked to do the same for their
packages.

Optimized variants of programs and libraries MAY be packaged in a
separate subpackage. The general packaging rules should be applied,
i.e. a separate package or packages SHOULD be created if the files
are large enough.

Available benchmark results [2,4] are narrow and not very convincing.
We should plan an evaluation of results after one release.  If it
turns out that the real gains are too small, we can scrap the effort.
On the other hand, we should also consider other architectures. For
example, microarchitecture levels `z{14,15}` for `s390x` or
`power{9,10}` for `ppc64le`. Other architectures are not included in
this Change Proposal to reduce its scope.


== Feedback ==


== Benefit to Fedora ==
The developers who are interested in this kind of optimization work
can perform it within Fedora, without having to build separate
repositories. The users who have the appropriate hardware will gain
performance benefits. Faster code is also more energy-efficient. The
change will be automatic and transparent to users.

Note that other distributions use higher microarchitecture levels. For
example RHEL 9 uses x86-64-v2 as the baseline, RHEL 10 will use
x86-64-v3, and other distros provide optimized variants (openSUSE,
Arch Linux). We implement the same change in Fedora in a way that is
scoped more narrowly, but should provide the same performance and
energy benefits.

== Scope ==
* Proposal owners:
** Extend systemd to set the executable search path using the same
criteria as the dynamic linker.
** Implement packaging changes for at least one package with a library
and at least one package with executables and submit this as pull
requests.
** Provide a pull request for the Packaging Guidelines to describe the
changes listed in Description above.

* Other developers:
** Do benchmarking and implement packaging changes for other packages
if beneficial.

* Release engineering: [https://pagure.io/releng/issue/11864 #11864]

* Policies and guidelines: TBD.

* Trademark approval: N/A (not needed for this Change)

* Alignment with Community Initiatives:


== Upgrade/compatibility impact ==
No impact.


== How To Test ==

* Use `/usr/bin/ld.so --help` to check which hwcaps are supported by the system.
* Install one or more packages which provide optimized code.
* Restart the system or re-login to reinitialize `$PATH`.
* Check that appropriate directories are present in `$PATH`.
* Run some benchmarks and check that the optimized code is indeed faster.
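
The first and fourth steps above could be scripted along these lines. This sketch assumes that `ld.so --help` marks usable levels with the text `(supported, searched)`, as recent glibc versions print. The output format is not a stable interface, and both function names are hypothetical.

```python
import re


def supported_hwcaps(ld_so_help):
    """Return hwcaps subdirectory names marked as supported, in listed order.

    `ld_so_help` is the text printed by `ld.so --help`. The marker
    string matched here is an assumption about that output.
    """
    return re.findall(
        r"^[ \t]*(\S+)[ \t]+\(supported, searched\)[ \t]*$",
        ld_so_help,
        flags=re.MULTILINE,
    )


def hwcaps_dirs_in_path(path):
    """Return the entries of a colon-separated path that are hwcaps dirs."""
    return [entry for entry in path.split(":") if "/glibc-hwcaps/" in entry]
```

The help text can be obtained with `subprocess.run(["/usr/bin/ld.so", "--help"], capture_output=True, text=True).stdout`, and the path with `os.environ["PATH"]`. After re-login, every supported level should have a matching entry in the second list.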


== User Experience ==
There should be no impact for users. If the optimized code is
available and installed for their hardware, various tasks may finish
faster and use less energy.


== Dependencies ==


== Contingency Plan ==
* Contingency mechanism: Undo the changes in packages which introduced
them and recompile.
* Contingency deadline: Any time.
* Blocks release? No.

== Documentation ==


== Release Notes ==
Packages which benefit from being compiled for higher AMD64
microarchitecture levels (`x86-64-v2`, `x86-64-v3`, `x86-64-v4`) are
now provided with optimized variants which will be used automatically
on appropriate CPUs. This includes: TBD1, TBD2, TBD3.




-- 
Aoife Moloney

Fedora Operations Architect

Fedora Project

Matrix: @amoloney:fedora.im

IRC: amoloney
--
_______________________________________________
devel-announce mailing list -- devel-announce@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-announce-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel-announce@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue