F42 Change Proposal: Optimized Binaries for the AMD64 / x86_64 Architecture (v2) (self-contained)

Wiki - https://fedoraproject.org/wiki/Changes/Optimized_Binaries_for_the_AMD64_Architecture_v2
Discussion thread -
https://discussion.fedoraproject.org/t/f42-change-proposal-optimized-binaries-for-the-amd64-x86-64-architecture-v2-self-contained/142032

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes
process, proposals are publicly announced in order to receive
community feedback. This proposal will only be implemented if approved
by the Fedora Engineering Steering Committee.


== Summary ==
Individual packages can already provide optimized libraries via the
glibc-hwcaps mechanism. This approach will be extended to executables.
The package provides an optimized variant of a binary in a different
directory. A symlink to a small helper program replaces the binary in
`/usr/bin`. At runtime, this program finds the most appropriate
variant and executes it.

The decision ''which'' packages provide optimized code, and at which
level, will be made by individual package maintainers based on
benchmark results.
A few programs/packages will be updated by the Change Owners to show
how the mechanism works.

== Owner ==
* Name: [[User:Zbyszek| Zbigniew Jędrzejewski-Szmek]]
* Name: [[User:Salimma| Michel Lind]]
* Name: José Relvas
* Emails: zbyszek@xxxxxxxxx, salimma@xxxxxxxxxxxxxxxxx


== Detailed Description ==
This is an updated version of
[[Changes/Optimized_Binaries_for_the_AMD64_Architecture]].

Fedora binaries for the AMD64 / x86_64 architecture are compiled with
code-generation flags that support almost all CPU variants. But newer
generations of processors gained additional instructions that may be
used to generate faster code. A vendor-independent x86-64 psABI
supplement defines four "microarchitecture levels": `x86-64-v1` (the
baseline, our code targets this), `x86-64-v2` (+`SSE3`, CentOS
targets this), `x86-64-v3` (+`AVX`), `x86-64-v4` (+`AVX512`) [1]. When
code is compiled for a higher microarchitecture level it will crash
(with `SIGILL`, "illegal instruction") on CPUs which do not support
it. Benchmark results show small differences in performance: usually
in the range from -5% to 10%, with no discernible difference for most
code, but '''some''' applications benefit, with gains of 120% in some
benchmarks [e.g. 2, 4].
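
As a rough illustration of how the levels nest, the following Python
sketch derives a level from the feature flags listed in
`/proc/cpuinfo`. The flag lists are an assumption: they are one
reading of the psABI feature sets translated to the Linux flag
spellings, and are fuller than the shorthand above. The function names
are hypothetical, and glibc itself queries CPUID rather than parsing
this file.

```python
# Flags each level adds, spelled as in /proc/cpuinfo (assumed mapping;
# e.g. "pni" is SSE3, "abm" is LZCNT, "xsave" stands in for OSXSAVE).
LEVEL_FLAGS = {
    2: {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"},
    3: {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"},
    4: {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},
}


def microarch_level(cpu_flags):
    """Return the highest level (1..4) whose requirements are all met."""
    flags = set(cpu_flags)
    level = 1
    for candidate in (2, 3, 4):
        # Levels are cumulative: stop at the first one that is not satisfied.
        if not LEVEL_FLAGS[candidate] <= flags:
            break
        level = candidate
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU, or an empty set if unavailable."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()
```

On a Linux machine, `microarch_level(read_cpu_flags())` gives an
estimate of the level that the running CPU supports.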

Over the years, various people have expressed interest in raising the
required microarchitecture levels. But we have been very conservative
in making changes, because support is missing in many older CPUs that
are still in use, and in fact, even in some CPUs produced and sold
today. By raising the required level we would make Fedora completely
unusable on many machines. It also seems that recompiling ''all''
packages with the changed options would largely be a waste of
resources, because for most code it makes no difference. But for some
of the numerical or cryptographic code there are noticeable gains and
it seems to be worth the effort to provide optimized code. This also
makes Fedora more attractive to people interested in optimization.

The dynamic linker already has the `glibc-hwcaps` mechanism to load
optimized implementations of ''shared objects'' [3]. This means that
packages can provide optimized libraries and the linker will
automatically load them from separate directories if appropriate.
(For AMD64, this is `/usr/lib64/glibc-hwcaps/x86-64-v{2,3,4}/`.)
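
The resulting lookup order can be sketched in Python as follows. This
is a simplified model, not the dynamic linker's code: it assumes the
subdirectories are tried from the highest level the CPU supports
downwards, followed by the plain library directory, and it ignores
`ld.so.cache` and other search-path details. The function name is
hypothetical.

```python
def hwcaps_search_dirs(libdir, cpu_level):
    """List directories to try for a library, most specific first.

    cpu_level is the x86-64 microarchitecture level of the CPU (1..4).
    """
    # x86-64-v1 is the baseline, so it has no glibc-hwcaps subdirectory.
    dirs = [
        f"{libdir}/glibc-hwcaps/x86-64-v{level}"
        for level in range(cpu_level, 1, -1)
    ]
    dirs.append(libdir)  # baseline build, always usable
    return dirs
```

For a CPU at level 3, `hwcaps_search_dirs("/usr/lib64", 3)` lists the
`x86-64-v3` directory, then `x86-64-v2`, then `/usr/lib64` itself.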

'''This Change is about extending the glibc-hwcaps mechanism to
''executables''. A small helper binary is provided. A program in
`/usr/bin` (or another path) is symlinked to this helper. When
executed, the helper checks the capabilities of the CPU and searches
for the most appropriate variant of the target program in a separate
directory hierarchy. It then launches one of the optimized binaries or
the "generic" one compiled for the baseline.'''
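
To make the control flow concrete, here is a minimal Python sketch of
what such a helper might do. It is not the implementation of
hwcaps-loader. The directory root `/usr/lib/hwcaps-bin`, the `generic`
subdirectory name, and all function names are hypothetical, because
this proposal does not specify the layout. The CPU level is passed in
as a parameter instead of being detected.

```python
import os

# HYPOTHETICAL layout, chosen only for this illustration.
VARIANT_ROOT = "/usr/lib/hwcaps-bin"


def candidate_paths(program, cpu_level, root=VARIANT_ROOT):
    """Paths to try, best variant first, ending with the baseline build."""
    subdirs = [f"x86-64-v{level}" for level in range(cpu_level, 1, -1)]
    subdirs.append("generic")
    return [os.path.join(root, subdir, program) for subdir in subdirs]


def pick_variant(program, cpu_level, root=VARIANT_ROOT, exists=os.path.exists):
    """Return the first candidate that exists, or None if there is none."""
    for path in candidate_paths(program, cpu_level, root):
        if exists(path):
            return path
    return None


def main(argv, cpu_level):
    # The helper is reached through a symlink, so argv[0] names the
    # program the user asked for.
    program = os.path.basename(argv[0])
    target = pick_variant(program, cpu_level)
    if target is None:
        raise SystemExit(f"{program}: no usable variant found")
    # Replace the helper process with the chosen binary, keeping argv.
    os.execv(target, argv)
```

Because the helper replaces itself with `execv`, the selected binary
keeps the same process ID and arguments, which is what makes the
indirection invisible to callers.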

This means that individual packages "opt in" by moving their binary
to the alternative directory hierarchy, replacing it with a symlink,
and providing one or more optimized variants.

Note: the ELF format provides the IFUNC mechanism to dynamically
select a variant of a function (symbol) when an executable is loaded
[5]. This is in particular used to load code using specific CPU
instructions when those are supported. This mechanism is more
general (because it allows arbitrary selection criteria), more
fine-grained (because there can be other variants than just a few
fixed microarchitecture levels), and more efficient (because only the
parts of the code that benefit from this need to be provided in
multiple variants). In particular, glibc already makes extensive use
of this to provide optimized code, which is then widely used by other
libraries and programs. This means that even though we compile code in
a way where the lowest baseline is supported, modern CPU instructions
are already widely used. This is one of the reasons why compiling for
a higher baseline often doesn't make any difference in benchmarks. The
IFUNC mechanism or an equivalent mechanism should generally be
preferred. Nevertheless, that needs to be implemented in the program
or library itself, which is not trivial. The mechanism in this
Proposal is intended for code which does not use IFUNCs or some
other similar mechanism.

[1] https://hackweek.opensuse.org/all/projects/support-glibc-hwcaps-and-micro-architecture-package-generation<BR>
[2] https://gitlab.archlinux.org/archlinux/rfcs/-/blob/master/rfcs/0002-march.rst<BR>
[3] https://sourceware.org/pipermail/libc-alpha/2021-February/122207.html<BR>
[4] https://blog.centos.org/2023/08/centos-isa-sig-performance-investigation/<BR>
[5] https://jasoncc.github.io/gnu_gcc_glibc/gnu-ifunc.html<BR>

Glibc-hwcaps together with the new helper provide a generic mechanism.
It will be up to individual packages to actually provide code which
makes use of it. Individual package maintainers are encouraged to
benchmark their packages after recompilation, and provide the
optimized variants if useful. (I.e. the code in question is measurably
faster '''and''' the program is run often enough for this to make a
difference.)
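
One simple way to make such a comparison is sketched below in Python.
It is only a starting point under stated assumptions: wall-clock time
of whole-program runs is the metric, the median of a few runs is used
to dampen noise, and "gain" is defined as how much more work the
optimized build does per unit of time. The function names are
hypothetical, and the cited benchmarks may define their percentages
differently.

```python
import statistics
import subprocess
import time


def median_runtime(cmd, runs=5):
    """Median wall-clock time in seconds of running cmd (a list of args)."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup_percent(baseline_seconds, optimized_seconds):
    """Throughput gain of the optimized build relative to the baseline."""
    return (baseline_seconds / optimized_seconds - 1.0) * 100.0
```

With this definition, an optimized build that finishes in half the
time of the baseline counts as a 100% gain, and a slower build yields
a negative number.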

The Change Owners will implement the packaging changes for a few
packages while developing the general mechanism and will submit those
as pull requests. Other maintainers are asked to do the same for their
packages if desired.

Optimized variants of programs and libraries MAY be packaged in a
separate subpackage. The general packaging rules should be applied,
i.e. a separate package or packages SHOULD be created if the files
are large enough.

Available benchmark results [2,4] are narrow and not very convincing.
We should plan an evaluation of results after one release. If it
turns out that the real gains are too small, we can scrap the effort.
On the other hand, we should also consider other architectures. For
example, microarchitecture levels `z{14,15}` for `s390x` or
`power{9,10}` for `ppc64le`. Other architectures are not included in
this Change Proposal to reduce its scope.

== Feedback ==


== Benefit to Fedora ==
The developers who are interested in this kind of optimization work
can perform it within Fedora, without having to build separate
repositories. The users who have the appropriate hardware will gain
performance benefits. Faster code is also more energy-efficient. The
change will be automatic and transparent to users.

Note that other distributions use higher microarchitecture levels. For
example RHEL 9 uses x86-64-v2 as the baseline, RHEL 10 uses x86-64-v3,
and other distros provide optimized variants
([https://en.opensuse.org/openSUSE:X86-64-Architecture-Levels
openSUSE], Arch Linux,
[https://ubuntu.com/blog/optimising-ubuntu-performance-on-amd64-architecture
Ubuntu]). We implement the same change in Fedora in a way that is
scoped more narrowly, and thus vastly cheaper in the sense of
development effort, code compilation time, storage and distribution
overhead, but should provide the same performance and energy benefits.

== Scope ==
* Proposal owners:
** Package [https://github.com/jrelvas-ipc/hwcaps-loader hwcaps-loader].
** Find some example packages to convert (the code must do "number
crunching" or string processing, and must not already use IFUNCs or
glibc-hwcaps or some other mechanism).
** Convert a few packages and submit the changes as pull requests.
** Submit a draft change to Packaging Guidelines
** Do benchmarks.

* Other developers:
** Consider converting some additional packages.
** Review and merge the Packaging Guidelines change

* Release engineering: [https://pagure.io/releng/issues #Releng issue number]

* Policies and guidelines: N/A (not needed for this Change)

* Trademark approval: N/A (not needed for this Change)

* Alignment with the Fedora Strategy:


== Upgrade/compatibility impact ==


== Early Testing (Optional) ==
Do you require 'QA Blueprint' support? N


== How To Test ==


== User Experience ==
The change should be invisible to users, except that some programs may
execute more quickly.


== Dependencies ==


== Contingency Plan ==
* Contingency mechanism: Revert changes in individual packages. This
can be either by the maintainers of those packages or by the Change
Owners using provenpackager privileges.
* Contingency deadline: any time. The changes are independent
between packages, so we can trivially convert and unconvert individual
programs even after release.
* Blocks release? No

== Documentation ==
N/A (not a System Wide Change)

== Release Notes ==


-- 
Aoife Moloney

Fedora Operations Architect

Fedora Project

Matrix: @amoloney:fedora.im

IRC: amoloney

-- 
_______________________________________________
devel-announce mailing list -- devel-announce@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-announce-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel-announce@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue