[...]
gcc version 13.2.1 20231011 (Red Hat 13.2.1-4) (GCC)
From Fedora 38. So "a bit" newer :P
I'll retry with a newer toolchain.
FWIW, with the code fix and the original compiler:
Fork, order-0, Apple M2:
| kernel | mean_rel | std_rel |
|:----------------------|-----------:|----------:|
| mm-unstable | 0.0% | 0.8% |
| hugetlb-rmap-cleanups | 1.3% | 2.0% |
| fork-batching | 4.3% | 1.0% |
Fork, order-9, Apple M2:
| kernel | mean_rel | std_rel |
|:----------------------|-----------:|----------:|
| mm-unstable | 0.0% | 0.8% |
| hugetlb-rmap-cleanups | 0.9% | 0.9% |
| fork-batching | -37.3% | 1.0% |
Fork, order-0, Ampere Altra:
| kernel | mean_rel | std_rel |
|:----------------------|-----------:|----------:|
| mm-unstable | 0.0% | 0.7% |
| hugetlb-rmap-cleanups | 3.2% | 0.7% |
| fork-batching | 5.5% | 1.1% |
Fork, order-9, Ampere Altra:
| kernel | mean_rel | std_rel |
|:----------------------|-----------:|----------:|
| mm-unstable | 0.0% | 0.1% |
| hugetlb-rmap-cleanups | 0.5% | 0.1% |
| fork-batching | -10.4% | 0.1% |
I just gave it another quick benchmark run on that Intel system.
hugetlb-rmap-cleanups -> fork-batching
order-0: 0.014114 -> 0.013848
-1.9%
order-9: 0.014262 -> 0.009410
-34%
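For anyone wanting to reproduce the percentages: they are consistent with the plain relative change (new - old) / old. A minimal sketch, where the helper name is made up for illustration and I'm assuming the raw values are run times, so lower is better:

```python
def relative_change_percent(old: float, new: float) -> float:
    """Relative change from old to new, in percent.

    Assumption: the inputs are run times, so a negative result
    means the new kernel is faster.
    """
    return (new - old) / old * 100.0


# Values from the Intel run above (hugetlb-rmap-cleanups -> fork-batching).
order0 = relative_change_percent(0.014114, 0.013848)
order9 = relative_change_percent(0.014262, 0.009410)

print(f"order-0: {order0:.1f}%")  # order-0: -1.9%
print(f"order-9: {order9:.1f}%")  # order-9: -34.0%
```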
Note that I disable SMT and turbo, and pin the test to one CPU, to make
the results as stable as possible. My kernel config has everything related
to debugging disabled.
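As a rough illustration of the pinning part only (not necessarily how it was done for the runs above), a benchmark process on Linux can restrict itself to a single CPU before it starts measuring. The helper name is hypothetical; SMT and turbo are not touched here and would have to be disabled separately, e.g. in firmware or via sysfs:

```python
import os


def pin_to_one_cpu() -> int:
    """Pin the calling process to one CPU it is already allowed to run on.

    Linux-only: os.sched_getaffinity/os.sched_setaffinity are not
    available on other platforms.
    """
    # Pick the lowest-numbered CPU from the current affinity mask so the
    # call cannot fail because of a CPU we are not permitted to use.
    cpu = min(os.sched_getaffinity(0))
    os.sched_setaffinity(0, {cpu})
    return cpu


cpu = pin_to_one_cpu()
print(f"pinned to CPU {cpu}; affinity is now {sorted(os.sched_getaffinity(0))}")
```

Child processes inherit the affinity mask, so a fork benchmark launched from such a process stays on the same CPU.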
--
Cheers,
David / dhildenb