F41 Change Proposal - Python Built with gcc -O3 (self-contained)

Wiki - https://fedoraproject.org/wiki/Changes/Python_built_with_gcc_O3
Discussion.fpo -
https://discussion.fedoraproject.org/t/f41-change-proposal-python-built-with-gcc-03-self-contained/112743


This is a proposed Change for Fedora Linux. As part of the Changes
process, proposals are publicly announced in order to receive
community feedback. This proposal will only be implemented if approved
by the Fedora Engineering Steering Committee.



== Summary ==
Instead of [https://docs.fedoraproject.org/en-US/packaging-guidelines/#_compiler_flags Fedora's default `-O2` compiler flag],
we will use `-O3` to build CPython.
This only impacts the interpreter and the Python standard library, not any
third-party extension modules built as RPMs or on developer machines.
This aligns with the way Python is built upstream.
According to our performance measurements, it makes Python
measurably faster (pyperformance geometric mean: 1.04x faster).

== Owner ==
* Name: [[User:churchyard|Miro Hrončok]]
* Email: mhroncok@xxxxxxxxxx


== Detailed Description ==

We will replace the `-O2` compiler flag with `-O3` when building the
python3.13 package. This change may be backported to older Pythons if
desired. [[Changes/Python3.13|Python 3.13 should be the main Python
version in Fedora 41+]].

The [https://docs.fedoraproject.org/en-US/packaging-guidelines/#_compiler_flags Fedora packaging guidelines]
about compiler flags explicitly say:

''> Overriding these flags for performance optimizations (for
instance, `-O3` instead of `-O2`) is generally discouraged. If you can
present benchmarks that show a significant speedup for this particular
code, this could be revisited on a case-by-case basis.''

This change proposal presents such benchmarks and a case for Python to
use `-O3`.

This change is limited to the CPython interpreter and the extension modules
from the Python standard library, thanks to
[[Changes/Python_Extension_Flags_Reduction]] (since Fedora 39). Other
Python extension modules will keep building as before: in RPM
packages, for example, they will still be built with `-O2`, unless Fedora
changes that globally. Extension modules built with `-O2` still work with
a Python built with `-O3`.
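As a rough way to see this split on an installed interpreter, the optimization level recorded in `sysconfig` can be inspected. This is a minimal sketch, not part of the proposal: `optimization_level` and `report` are hypothetical helpers, and which configuration variable ends up carrying `-O3` on a given Fedora build (for example `CFLAGS_NODIST` for interpreter-only flags versus `CFLAGS` for flags that extension builds pick up) is an assumption to verify locally.

```python
import re
import sysconfig


def optimization_level(flags):
    """Return the last -O<level> flag in a compiler flag string, or None.

    GCC honours the last -O option on the command line, so the last
    match is the effective one.
    """
    # Only match standalone flags, not e.g. "-Wl,-O1" passed to the linker.
    found = re.findall(r"(?<!\S)-O(\w*)", flags or "")
    return "-O" + found[-1] if found else None


def report():
    """Print the optimization level found in a few sysconfig variables."""
    # Assumption: these variable names exist on POSIX builds of CPython;
    # a missing variable simply yields None.
    for name in ("CFLAGS", "CFLAGS_NODIST", "PY_CFLAGS", "PY_CFLAGS_NODIST", "OPT"):
        value = sysconfig.get_config_var(name)
        print(f"{name}: {optimization_level(value)}")


if __name__ == "__main__":
    report()
```

If the assumption holds, the variables used only for the interpreter build would report `-O3`, while those inherited by third-party extension builds would still report `-O2`.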

== Feedback ==


== Benefit to Fedora ==
Upstream already builds Python with `-O3` by default. Fedora's Python
built with `-O3` is faster (1.04x):

{| class="wikitable sortable"
|+ Benchmark with python3.12-3.12.2-3.fc41
|-
! Benchmark !! -O2 !! -O3 !! Change !! Significance
|-
| 2to3 || 465 ms || 446 ms || 1.04x faster || Significant (t=21.72)
|-
| async_generators || 853 ms || 784 ms || 1.09x faster || Significant (t=36.61)
|-
| async_tree_cpu_io_mixed || 1.19 sec || 1.11 sec || 1.08x faster || Significant (t=13.38)
|-
| async_tree_cpu_io_mixed_tg || 1.17 sec || 1.09 sec || 1.08x faster || Significant (t=18.69)
|-
| async_tree_eager || 202 ms || 189 ms || 1.07x faster || Significant (t=7.99)
|-
| async_tree_eager_cpu_io_mixed || 727 ms || 664 ms || 1.09x faster || Significant (t=18.56)
|-
| async_tree_eager_cpu_io_mixed_tg || 633 ms || 558 ms || 1.13x faster || Significant (t=24.53)
|-
| async_tree_eager_io || 1.72 sec || 1.68 sec || 1.03x faster || Significant (t=6.13)
|-
| async_tree_eager_io_tg || 1.65 sec || 1.62 sec || 1.02x faster || Significant (t=4.65)
|-
| async_tree_eager_memoization || 437 ms || 422 ms || 1.04x faster || Significant (t=5.09)
|-
| async_tree_eager_memoization_tg || 330 ms || 322 ms || 1.03x faster || Significant (t=2.60)
|-
| async_tree_eager_tg || 137 ms || 125 ms || 1.09x faster || Significant (t=16.94)
|-
| async_tree_io || 1.64 sec || 1.60 sec || 1.02x faster || Significant (t=9.49)
|-
| async_tree_io_tg || 1.65 sec || 1.61 sec || 1.02x faster || Not significant
|-
| async_tree_memoization || 895 ms || 871 ms || 1.03x faster || Significant (t=3.73)
|-
| async_tree_memoization_tg || 848 ms || 836 ms || 1.01x faster || Not significant
|-
| async_tree_none || 718 ms || 700 ms || 1.03x faster || Significant (t=6.90)
|-
| async_tree_none_tg || 686 ms || 659 ms || 1.04x faster || Significant (t=13.11)
|-
| asyncio_tcp || 757 ms || 748 ms || 1.01x faster || Not significant
|-
| asyncio_tcp_ssl || 2.58 sec || 2.56 sec || 1.01x faster || Not significant
|-
| asyncio_websockets || 419 ms || 418 ms || 1.00x faster || Not significant
|-
| bench_mp_pool || 10.7 ms || 10.7 ms || 1.00x faster || Not significant
|-
| bench_thread_pool || 1.62 ms || 1.61 ms || 1.01x faster || Not significant
|-
| chameleon || 12.2 ms || 12.0 ms || 1.02x faster || Not significant
|-
| chaos || 113 ms || 105 ms || 1.07x faster || Significant (t=46.23)
|-
| comprehensions || 37.4 us || 35.1 us || 1.07x faster || Significant (t=49.72)
|-
| coroutines || 42.4 ms || 41.4 ms || 1.02x faster || Significant (t=18.68)
|-
| coverage || 109 ms || 104 ms || 1.05x faster || Significant (t=33.91)
|-
| create_gc_cycles || 1.84 ms || 1.79 ms || 1.02x faster || Significant (t=5.50)
|-
| crypto_pyaes || 141 ms || 127 ms || 1.11x faster || Significant (t=86.61)
|-
| dask || 766 ms || 769 ms || 1.00x slower || Not significant
|-
| deepcopy || 619 us || 614 us || 1.01x faster || Not significant
|-
| deepcopy_memo || 71.3 us || 68.3 us || 1.04x faster || Significant (t=26.58)
|-
| deepcopy_reduce || 5.62 us || 5.56 us || 1.01x faster || Not significant
|-
| deltablue || 5.76 ms || 5.49 ms || 1.05x faster || Significant (t=7.97)
|-
| django_template || 62.8 ms || 59.7 ms || 1.05x faster || Significant (t=27.05)
|-
| docutils || 4.38 sec || 4.29 sec || 1.02x faster || Significant (t=11.25)
|-
| fannkuch || 706 ms || 667 ms || 1.06x faster || Significant (t=75.80)
|-
| float || 144 ms || 137 ms || 1.05x faster || Significant (t=24.66)
|-
| gc_traversal || 5.73 ms || 5.81 ms || 1.01x slower || Not significant
|-
| generators || 56.0 ms || 58.2 ms || 1.04x slower || Significant (t=-16.25)
|-
| genshi_text || 40.8 ms || 39.5 ms || 1.03x faster || Significant (t=17.64)
|-
| genshi_xml || 88.2 ms || 86.3 ms || 1.02x faster || Significant (t=6.96)
|-
| go || 223 ms || 217 ms || 1.03x faster || Significant (t=19.92)
|-
| hexiom || 10.3 ms || 9.76 ms || 1.05x faster || Significant (t=42.15)
|-
| html5lib || 109 ms || 108 ms || 1.01x faster || Not significant
|-
| json_dumps || 17.4 ms || 16.3 ms || 1.06x faster || Significant (t=45.38)
|-
| json_loads || 44.2 us || 42.3 us || 1.04x faster || Significant (t=27.71)
|-
| logging_format || 12.9 us || 12.4 us || 1.04x faster || Significant (t=9.81)
|-
| logging_silent || 176 ns || 174 ns || 1.01x faster || Not significant
|-
| logging_simple || 11.4 us || 11.0 us || 1.03x faster || Significant (t=9.94)
|-
| mako || 19.2 ms || 18.1 ms || 1.06x faster || Significant (t=54.89)
|-
| mdp || 4.46 sec || 4.33 sec || 1.03x faster || Significant (t=30.14)
|-
| meteor_contest || 189 ms || 167 ms || 1.13x faster || Significant (t=60.31)
|-
| nbody || 157 ms || 153 ms || 1.03x faster || Significant (t=4.34)
|-
| nqueens || 153 ms || 140 ms || 1.09x faster || Significant (t=63.60)
|-
| pathlib || 32.9 ms || 32.6 ms || 1.01x faster || Not significant
|-
| pickle || 18.6 us || 16.0 us || 1.16x faster || Significant (t=23.88)
|-
| pickle_dict || 45.8 us || 44.6 us || 1.03x faster || Significant (t=16.51)
|-
| pickle_list || 6.86 us || 6.59 us || 1.04x faster || Significant (t=19.65)
|-
| pickle_pure_python || 515 us || 505 us || 1.02x faster || Not significant
|-
| pidigits || 285 ms || 284 ms || 1.00x faster || Not significant
|-
| pprint_pformat || 2.72 sec || 2.54 sec || 1.07x faster || Significant (t=40.28)
|-
| pprint_safe_repr || 1.34 sec || 1.25 sec || 1.08x faster || Significant (t=58.43)
|-
| pyflate || 738 ms || 724 ms || 1.02x faster || Not significant
|-
| python_startup || 15.5 ms || 15.3 ms || 1.01x faster || Not significant
|-
| python_startup_no_site || 11.2 ms || 11.0 ms || 1.01x faster || Not significant
|-
| raytrace || 549 ms || 514 ms || 1.07x faster || Significant (t=45.37)
|-
| regex_compile || 245 ms || 233 ms || 1.05x faster || Significant (t=13.30)
|-
| regex_dna || 269 ms || 268 ms || 1.00x faster || Not significant
|-
| regex_effbot || 4.83 ms || 4.95 ms || 1.03x slower || Significant (t=-12.52)
|-
| regex_v8 || 33.7 ms || 33.1 ms || 1.02x faster || Not significant
|-
| richards || 75.7 ms || 71.9 ms || 1.05x faster || Significant (t=18.30)
|-
| richards_super || 85.2 ms || 81.4 ms || 1.05x faster || Significant (t=31.25)
|-
| scimark_fft || 662 ms || 587 ms || 1.13x faster || Significant (t=71.10)
|-
| scimark_lu || 199 ms || 190 ms || 1.04x faster || Significant (t=26.77)
|-
| scimark_monte_carlo || 123 ms || 117 ms || 1.05x faster || Significant (t=37.45)
|-
| scimark_sor || 217 ms || 210 ms || 1.04x faster || Significant (t=10.68)
|-
| scimark_sparse_mat_mult || 8.51 ms || 7.42 ms || 1.15x faster || Significant (t=62.99)
|-
| spectral_norm || 196 ms || 183 ms || 1.07x faster || Significant (t=95.78)
|-
| sqlalchemy_declarative || 239 ms || 234 ms || 1.02x faster || Significant (t=4.81)
|-
| sqlalchemy_imperative || 33.1 ms || 33.4 ms || 1.01x slower || Not significant
|-
| sqlglot_normalize || 197 ms || 187 ms || 1.05x faster || Significant (t=39.81)
|-
| sqlglot_optimize || 97.1 ms || 91.3 ms || 1.06x faster || Significant (t=47.14)
|-
| sqlglot_parse || 2.29 ms || 2.18 ms || 1.05x faster || Significant (t=14.70)
|-
| sqlglot_transpile || 2.79 ms || 2.67 ms || 1.04x faster || Significant (t=11.76)
|-
| sqlite_synth || 3.97 us || 3.90 us || 1.02x faster || Not significant
|-
| sympy_expand || 833 ms || 802 ms || 1.04x faster || Significant (t=19.41)
|-
| sympy_integrate || 34.7 ms || 33.8 ms || 1.03x faster || Significant (t=9.99)
|-
| sympy_str || 511 ms || 489 ms || 1.04x faster || Significant (t=18.17)
|-
| sympy_sum || 286 ms || 278 ms || 1.03x faster || Significant (t=14.46)
|-
| telco || 12.6 ms || 11.7 ms || 1.08x faster || Significant (t=9.31)
|-
| tomli_loads || 3.91 sec || 3.56 sec || 1.10x faster || Significant (t=46.29)
|-
| tornado_http || 213 ms || 212 ms || 1.01x faster || Not significant
|-
| typing_runtime_protocols || 214 us || 196 us || 1.09x faster || Significant (t=24.74)
|-
| unpack_sequence || 70.5 ns || 66.0 ns || 1.07x faster || Significant (t=8.58)
|-
| unpickle || 24.3 us || 22.0 us || 1.10x faster || Significant (t=10.67)
|-
| unpickle_list || 7.44 us || 8.61 us || 1.16x slower || Significant (t=-45.10)
|-
| unpickle_pure_python || 390 us || 360 us || 1.08x faster || Significant (t=37.48)
|-
| xml_etree_generate || 160 ms || 145 ms || 1.10x faster || Significant (t=44.33)
|-
| xml_etree_iterparse || 189 ms || 180 ms || 1.05x faster || Significant (t=20.16)
|-
| xml_etree_parse || 275 ms || 257 ms || 1.07x faster || Significant (t=20.58)
|-
| xml_etree_process || 106 ms || 98.6 ms || 1.08x faster || Significant (t=46.73)
|-
| Geometric mean || || || 1.04x faster ||
|}

Generated by `pyperformance run -o Ox.json` (with x being 2 or 3) and
`pyperformance compare -O table O2.json O3.json` on Fedora 40 x86_64 with
the rawhide-built python3.12-3.12.2-3.fc41, on a Lenovo X1 Carbon 3rd gen.
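To illustrate how a summary figure like the "Geometric mean" row relates to the per-benchmark numbers, here is a minimal sketch. It is not pyperformance's implementation: the helper names are hypothetical, and `SAMPLE` is only a small subset of rows copied from the table, using the rounded timings shown there. Its result will therefore differ from the 1.04x reported for the full suite.

```python
import math

# (benchmark, -O2 time, -O3 time), both in milliseconds.
# A small subset of the table above, chosen only for illustration.
SAMPLE = [
    ("2to3", 465.0, 446.0),
    ("generators", 56.0, 58.2),
    ("nqueens", 153.0, 140.0),
    ("raytrace", 549.0, 514.0),
]


def speedup(o2_time, o3_time):
    """Ratio > 1 means the -O3 build was faster; < 1 means slower."""
    return o2_time / o3_time


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid overflow."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def slowdowns(rows):
    """Names of benchmarks where the -O3 build took longer."""
    return [name for name, o2_time, o3_time in rows if o3_time > o2_time]


if __name__ == "__main__":
    ratios = [speedup(o2, o3) for _, o2, o3 in SAMPLE]
    print(f"geometric mean speedup (subset only): {geometric_mean(ratios):.2f}x")
    print("slower with -O3:", slowdowns(SAMPLE))
```

The geometric mean is the usual choice for averaging ratios, because a 2x speedup and a 2x slowdown cancel out to 1x instead of averaging to 1.25x.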

The benchmark was performed on Python 3.12 because it uses 3rd party
Python packages lacking support for Python 3.13. Once it is possible
to run such a benchmark for Python 3.13, we will do so.

The benchmark was performed on x86_64. Until somebody presents a
contradicting benchmark (or gives explicit reason for us to measure
it), we believe the change makes sense on all architectures.

== Scope ==
* Proposal owners:
** Change python3.13 to build with `-O3` instead of `-O2`
** Backport the change to older Pythons if desired

* Other developers: no action expected, report bugs when found

* Release engineering: no action expected

* Policies and guidelines: this change is following the spirit of the guidelines

* Trademark approval: not needed for this Change

* Alignment with Community Initiatives: faster Python, happier users,
more contributors?


== Upgrade/compatibility impact ==
None expected.

== How To Test ==

To verify this change has landed, inspect the build.log of python3.13.
It should be built with `gcc ... -O3`.
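The check of the build.log can be scripted. The sketch below assumes that compiler invocations appear in the log as lines starting with `gcc ` and that flags are separated by whitespace; the actual log layout may differ. `count_opt_levels` is a hypothetical helper, not an existing tool.

```python
import re
from collections import Counter


def count_opt_levels(log_lines):
    """Count the effective -O level of each gcc invocation in a build log.

    GCC uses the last -O option given, so only the last one per line counts.
    Lines that do not start with "gcc " are ignored.
    """
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line.startswith("gcc "):
            continue
        # Collect standalone optimization flags such as -O2, -O3, -Os, -Og.
        levels = [tok for tok in line.split() if re.fullmatch(r"-O\w*", tok)]
        counts[levels[-1] if levels else "none"] += 1
    return counts


if __name__ == "__main__":
    # Assumption: build.log has been downloaded to the current directory.
    with open("build.log", encoding="utf-8", errors="replace") as log:
        for level, count in count_opt_levels(log).most_common():
            print(f"{level}: {count} gcc invocations")
```

A build where the change has landed should show `-O3` as the effective level for the gcc invocations that compile the interpreter and the standard library modules.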

To test this change, test Fedora as you would normally do and assert
there are no regressions.

Run benchmarks, and report slowdowns if found.


== User Experience ==

Faster Python, faster Fedora.


== Dependencies ==

* [[Changes/Python_Extension_Flags_Reduction]] landed in Fedora 39
* [[Changes/Python3.13]] is expected to land in Fedora 41. If not, we
will apply this on Python 3.12.

== Contingency Plan ==

* Contingency mechanism: revert the change, rebuild Python
* Contingency deadline: Final Freeze
* Blocks release? No


== Documentation ==
N/A (not a System Wide Change)

== Release Notes ==

-- 
Aoife Moloney

Fedora Operations Architect

Fedora Project

Matrix: @amoloney:fedora.im

IRC: amoloney
--
_______________________________________________
devel-announce mailing list -- devel-announce@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-announce-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel-announce@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue