Fedora 32 System-Wide Change proposal: Build Python 3 to statically link with libpython3.8.a for better performance

https://fedoraproject.org/wiki/Changes/PythonStaticSpeedup

== Summary ==
Traditionally, Python 3 in Fedora has been built with a shared library,
libpython3.?.so, and the final binary has been dynamically linked against
that shared library. This change is about creating the static library
and linking the final python3 binary against it, as this provides a
significant performance improvement of up to 27%, depending on the
workload. The static library will not be shipped. The shared library
will continue to exist in a separate subpackage. In essence, python3
will no longer depend on libpython.

== Owner ==
* Name: [[User:Cstratak| Charalampos Stratakis]], [[User:Vstinner|
Victor Stinner]], [[User:Churchyard| Miro Hrončok]]
* Email: python-maint@xxxxxxxxxx


== Detailed Description ==

When we compile the python3 package on Fedora (prior to this change),
we create the libpython3.?.so shared library and the final python3
binary (<code>/usr/bin/python3</code>) is dynamically linked against
it. However, by building the libpython3.?.a static library and
statically linking the final binary against it, we can achieve a
performance gain of 5% to 27% depending on the workload. Link time
optimizations and profile guided optimizations also have a greater
impact when python3 is linked statically.

Since Python 3.8,
[https://docs.python.org/3.8/whatsnew/3.8.html#debug-build-uses-the-same-abi-as-release-build
C extensions must no longer be linked to libpython by default].
Applications embedding Python now need to utilize the --embed flag for
python3-config to be linked to libpython. During the
[[Changes/Python3.8|Python 3.8 upgrade and rebuilds]] we've uncovered
various cases of packages linking to libpython implicitly through
various hacks within their buildsystems and fixed as many as possible.
However, there are legitimate reasons to link an application to
libpython and for those cases libpython should be provided so
applications that embed Python can continue to do so.
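
The difference between building an extension and building an embedding application can be sketched as follows. This is only a simplified model of how <code>python3-config --libs</code> behaves with and without <code>--embed</code> on Python 3.8; the helper names <code>embed_link_flag</code> and <code>link_flags</code> are hypothetical, and the real tool prints additional system libraries that are omitted here.

```python
import sysconfig


def embed_link_flag():
    """Return the linker flag that pulls in libpython, e.g. '-lpython3.8'."""
    # LDVERSION is the version suffix used in the library name; fall back to
    # "major.minor" on builds that do not define it (an assumption for
    # portability of this sketch).
    ldversion = (sysconfig.get_config_var("LDVERSION")
                 or sysconfig.get_python_version())
    return "-lpython" + ldversion


def link_flags(embed):
    """Model the libpython part of the link line.

    embed=False: a C extension, which is not linked to libpython.
    embed=True:  an application embedding Python, which is.
    """
    return [embed_link_flag()] if embed else []


if __name__ == "__main__":
    print("C extension:          ", link_flags(embed=False))
    print("Embedding application:", link_flags(embed=True))
```

A build system that needs to embed Python would normally call <code>python3-config --embed</code> itself rather than reimplement this logic.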

This mirrors the Debian/Ubuntu way of building Python, where they
offer a statically linked binary and an additional libpython
subpackage. The libpython subpackage will be created and python3-devel
will depend on it, so packages that embed Python will keep working.

Debian and Ubuntu made this change years ago, and upstream Python
followed in 3.8. The manylinux1 and manylinux2010 ABIs don't link C
extensions to libpython either (to support Debian/Ubuntu).

By applying this change, libpython's namespace will be separated from
Python's, so '''C extensions that are still linked to libpython'''
might experience side effects or break.

There is one exception for C extensions. If an application is linked
to libpython in order to embed Python, C extensions used only within
this application can continue to be linked to libpython.

Currently there is no upstream option to build both the static and the
shared library and statically link the final binary to the static one,
so we have to rely on a downstream patch to achieve it. We plan to work
with upstream to incorporate the changes there as well.

Before the change, python3.8 is dynamically linked to libpython3.8:

<pre>
+-------------------+
|                   |
|                   |         +--------------------+
|  libpython3.8.so  <---------+ /usr/bin/python3.8 |
|                   |         +--------------------+
|                   |
+-------------------+
</pre>

After the change, python3.8 is statically linked to libpython3.8:

<pre>
                              +-----------------------+
                              |                       |
                              |   /usr/bin/python3.8  |
                              |                       |
+-------------------+         | +-------------------+ |
|                   |         | |                   | |
|                   |         | |                   | |
|  libpython3.8.so  |         | |  libpython3.8.a   | |
|                   |         | |                   | |
|                   |         | |                   | |
+-------------------+         | +-------------------+ |
                              +-----------------------+
</pre>

As a negative side effect, when both libpython3.8.so and
/usr/bin/python3.8 are installed, the filesystem footprint will
increase slightly (libpython3.8.so on Python 3.8.0, x86_64 is ~3.4M).
On the other hand, only a very small number of packages will depend on
libpython3.8.so.

== Benefit to Fedora ==
Python's performance will increase significantly, depending on the
workload. Since many core components of the OS also depend on Python,
this could lead to an increase in their performance as well. However,
individual benchmarks will need to be conducted to verify the
performance gain for those components.

[https://pyperformance.readthedocs.io/ pyperformance] results,
ignoring differences smaller than 5%:

(see wiki page for table)

== Scope ==
* Proposal owners:
** Review and merge the
[https://src.fedoraproject.org/rpms/python3/pull-request/133 pull
request with the implementation].
** Go through the Python C extension packages that are linked to
libpython and test if things work correctly. A copr repository will be
provided for testing.

* Other developers: Other developers are encouraged to test the new
statically linked python3 and check if their package works as expected.
* Release engineering: [https://pagure.io/releng/issue/8953 #8953]
This change does not require a mass rebuild. However, a rebuild of the
affected packages will be required. The affected packages will be
rebuilt in copr first.
* Policies and guidelines: The packaging guidelines will need to be
updated to explicitly mention that C extensions should not be linked
to libpython, and that the python3 binary is statically linked.
* Trademark approval: N/A (not needed for this Change)

== Upgrade/compatibility impact ==
Affected package maintainers should verify that their packages work as
expected. The only impact end users should see is a performance
increase for workloads relying on Python.

== How To Test ==
Copr repo with instructions:
https://copr.fedorainfracloud.org/coprs/g/python/Python3_statically_linked/

=== Package changes test ===
The change will bring the new <code>libpython3</code> subpackage as a
dependency of <code>python3-devel</code>.

Test that it's installed:
<pre>
$ rpm -q libpython3
</pre>

Test that it's uninstalled if <code>python3-devel</code> is removed:
<pre>
$ dnf remove python3-devel
</pre>

Test that <code>python3-libs</code> no longer includes the libpython
shared library.
<pre>
$ rpm -ql python3-libs | grep libpython3
</pre>

=== Dynamic linker test ===

To check that the python3.8 program is not linked to libpython, ldd
can be used. For example, Python 3.7 will still be linked to
libpython:

<pre>
$ ldd /usr/bin/python3.7|grep libpython
libpython3.7m.so.1.0 => /lib64/libpython3.7m.so.1.0 (0x00007fbb57333000)
</pre>

But python3.8 will no longer be linked to libpython:

<pre>
$ ldd /usr/bin/python3.8|grep libpython
</pre>

=== Performance test ===

The performance speedup can be measured using the official Python
benchmark suite [https://pyperformance.readthedocs.io/ pyperformance]:
see [https://pyperformance.readthedocs.io/usage.html#run-benchmarks
Run benchmarks].
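
When comparing two benchmark runs by hand, the "ignore differences smaller than 5%" rule used for the results in this proposal can be sketched as below. The function name <code>significant_changes</code>, the benchmark names, and the timings are all hypothetical and serve only to show the filtering; they are not measured results.

```python
def significant_changes(before, after, threshold=0.05):
    """Return {benchmark: relative change} for changes of at least threshold.

    before and after map benchmark names to timings in seconds.
    A negative relative change means the benchmark got faster.
    """
    changes = {}
    for name, old in before.items():
        new = after.get(name)
        if new is None or old <= 0:
            # Benchmark missing from the second run, or unusable baseline
            continue
        relative = (new - old) / old
        if abs(relative) >= threshold:
            changes[name] = relative
    return changes


if __name__ == "__main__":
    # Made-up timings, for illustration only
    dynamic = {"bench_a": 1.00, "bench_b": 2.00}
    static = {"bench_a": 0.80, "bench_b": 1.96}
    for name, rel in sorted(significant_changes(dynamic, static).items()):
        print("%s: %+.1f%%" % (name, rel * 100))
```

For real measurements, the comparison tooling that ships with pyperformance is the better choice; this sketch only illustrates the threshold.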

=== Namespace test ===

The following script can be used to verify that the change is in effect:

<pre>
import ctypes
import sys

# In CPython the empty tuple is a singleton within one copy of the runtime.
EMPTY_TUPLE_SINGLETON = ()

def get_empty_tuple(lib):
    # Call PyTuple_New(0)
    func = lib.PyTuple_New
    func.argtypes = (ctypes.c_ssize_t,)
    func.restype = ctypes.py_object
    return func(0)

def test_lib(libname, lib):
    # If lib shares the interpreter's namespace, PyTuple_New(0) returns
    # the very same empty tuple object as the one created above.
    obj = get_empty_tuple(lib)
    if obj is EMPTY_TUPLE_SINGLETON:
        print("%s: SAME namespace" % libname)
    else:
        print("%s: DIFFERENT namespace" % libname)

def test():
    # C API symbols as seen by the running python3 program
    program = ctypes.pythonapi

    if hasattr(sys, 'abiflags'):
        abiflags = sys.abiflags
    else:
        # Python 2
        abiflags = ''
    ver = sys.version_info
    # Explicitly load the shared libpython matching this interpreter
    filename = ('libpython%s.%s%s.so.1.0'
                % (ver.major, ver.minor, abiflags))
    libpython = ctypes.cdll.LoadLibrary(filename)

    test_lib('program', program)
    test_lib('libpython', libpython)

test()
</pre>

Output before the change:
<pre>
program: SAME namespace
libpython: SAME namespace
</pre>

Output after the change:

<pre>
program: SAME namespace
libpython: DIFFERENT namespace
</pre>
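
As a complementary runtime check, one can look at which shared objects are mapped into a freshly started interpreter. The sketch below assumes a Linux system with <code>/proc/self/maps</code>; the helper name <code>mapped_libpython</code> is hypothetical. It must be run in a new process, not after the script above, because that script loads libpython explicitly.

```python
import os


def mapped_libpython(maps_text):
    """Return sorted unique paths of mapped files named like libpython."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if "libpython" in os.path.basename(path):
                paths.add(path)
    return sorted(paths)


if __name__ == "__main__":
    maps_file = "/proc/self/maps"
    if os.path.exists(maps_file):
        with open(maps_file) as handle:
            found = mapped_libpython(handle.read())
        if found:
            print("libpython is mapped:", ", ".join(found))
        else:
            print("libpython is not mapped into this process")
    else:
        print("/proc/self/maps is not available on this system")
```

With a statically linked python3, no libpython mapping is expected unless an imported C extension is itself linked to libpython.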

== User Experience ==
Python-based workloads should see a performance gain of up to 27%.

== Dependencies ==
While this specific change is not dependent on anything else, we would
like to ensure that all the packages that link to libpython continue
to work as expected.

Currently (2019-10-30), 118 packages on rawhide depend on libpython.

Result of the "repoquery --repo=rawhide --source --whatrequires
'libpython3.8.so.1.0()(64bit)' " command on Fedora Rawhide, x86_64:

*COPASI
*Io-language
*OpenImageIO
*YafaRay
*antimony
*blender
*boost
*calamares
*calibre
*cantor
*ceph
*clingo
*condor
*createrepo_c
*csound
*cvc4
*dionaea
*dmlite
*domoticz
*fontforge
*freecad
*gdb
*gdcm
*gdl
*getdp
*glade
*globus-net-manager
*glom
*gnucash
*gpaw
*hamlib
*hokuyoaist
*hugin
*insight
*kdevelop-python
*kicad
*kitty
*krita
*lammps
*ldns
*libCombine
*libarcus https://src.fedoraproject.org/rpms/libarcus/pull-request/8
*libarcus-lulzbot
*libbatch
*libcec
*'''libcomps'''
*'''libdnf'''
*libftdi
*libkml
*libkolabxml
*libldb
*libnuml
*libpeas
*libplist
*libreoffice
*librepo
*libsavitar
*libsbml
*libsedml
*libtalloc
*libyang
*libyui-bindings
*link-grammar
*lldb
*mathgl
*med
*mod_wsgi
*nautilus-python
*nbdkit
*nest
*netgen-mesher
*neuron
*nextpnr
*nordugrid-arc
*nwchem
*openbabel
*openscap
*opentrep
*openvdb
*pam_wrapper
*paraview
*perl-Inline-Python
*pidgin
*pitivi
*plplot
*postgresql
*pynac
*pyotherside
*pythia8
*python-gstreamer1
*python-jep
*python-qt5
*<del>python3</del>
*qgis
*qpid-dispatch
*qpid-proton
*rdkit
*renderdoc
*rmol
*root
*samba
*scidavis
*sigil
*swift-lang
*texworks
*thunarx-python
*trademgen
*trellis
*unbound
*uwsgi
*vdr-epg-daemon
*vigra
*'''vim'''
*vrpn
*vtk
*weechat
*znc

Packages in '''bold''' are the ones present in the default
docker/podman "fedora:rawhide" image.

== Contingency Plan ==
* Contingency mechanism: If issues appear that cannot be fixed in a
timely manner, the change can be easily reverted and will be considered
again for the next Fedora release. A proper upgrade path will also be
provided in case of reversion, since libpython3.?.so will be a separate
package with this change.
* Contingency deadline: Before the beta freeze of Fedora 32 (2020-02-25)
* Blocks release? Yes
* Blocks product? None

== Documentation ==
The change will be documented in the updated Python packaging
guidelines.



-- 
Ben Cotton
He / Him / His
Fedora Program Manager
Red Hat
TZ=America/Indiana/Indianapolis