On Wed, Jul 24, 2024 at 02:26:32PM GMT, David Laight wrote: > The changes to minmax.h that changed the type check to a signedness > check significantly increased the length of the expansion. > In some cases it has also significantly increased compile type. > This is particularly noticeable for nested expansions. > > These changes reduce the expansions somewhat. > The biggest change is the last patch that directly implements > min3() and max3() rather than using a nested expansion. > > Further significant improvements can be made by removing the > requirement that min(1,2) be 'constant enough' for an array size. > Instead supporting MIN() and MAX() for constants only with a result > that is valid for a static initialiser. > However that needs an initial change to the few files that have > local versions of MIN() or MAX(). > > > David Laight (7): > minmax: Put all the clamp() definitions together > minmax: Use _Static_assert() instead of static_assert() > compiler.h: Add __if_constexpr(expr, if_const, if_not_const) > minmax: Simplify signedness check > minmax: Factor out the zero-extension logic from umin/umax. > minmax: Optimise _Static_assert() check in clamp(). > minmax: minmax: Add __types_ok3() and optimise defines with 3 > arguments > > include/linux/compiler.h | 65 +++++---------- > include/linux/minmax.h | 176 ++++++++++++++++++++------------------- > 2 files changed, 113 insertions(+), 128 deletions(-) > > -- > 2.17.1 > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales) > FWIW, I measured ~26.7s user time improvement with CONFIG_XEN set, leaving another ~15.9s vs. reverted case. It reduces the time taken on the egregious arch/x86/xen/setup.c down to .7s which is comparable to the reverted case. If CONFIG_XEN is not set, we see a ~32.5s user time improvement, leaving another ~15.8s vs. reverted case. So this series definitely improves things, but there is still more that could be done. I share Arnd's concerns about the need to be _very_ careful about very sensitive and complicated logic in these macros so we need to be very sure we're not breaking anything here before we proceed. Be good to really kick the tyres if we can. My rough numbers below, on a 32-core intel i9-14900KF box using defconfig + a small number of debug flags I use for compiler development: ## CONFIG_XEN enabled with original min()/max() impl make 1639.84s user 95.25s system 2358% cpu 1:13.58 total make 1638.79s user 96.40s system 2366% cpu 1:13.33 total make 1635.00s user 96.44s system 2359% cpu 1:13.37 total ## CONFIG_XEN enabled with this patch series make 1609.13s user 92.71s system 2321% cpu 1:13.30 total make 1611.94s user 92.45s system 2366% cpu 1:12.01 total make 1612.51s user 92.35s system 2368% cpu 1:11.99 total ## CONFIG_XEN enabled with original min()/max() changes reverted make 1588.46s user 92.33s system 2430% cpu 1:09.16 total make 1598.57s user 93.49s system 2419% cpu 1:09.94 total make 1598.99s user 92.49s system 2419% cpu 1:09.91 total ## No CONFIG_XEN with original min/max() impl make 1570.76s user 91.62s system 2365% cpu 1:10.28 total make 1573.93s user 92.74s system 2367% cpu 1:10.40 total make 1576.97s user 92.33s system 2363% cpu 1:10.62 total ## No CONFIG_XEN with this patch series make 1539.13s user 88.16s system 2347% cpu 1:09.31 total make 1541.55s user 88.84s system 2355% cpu 1:09.22 total make 1543.44s user 89.76s system 2355% cpu 1:09.34 total ## No CONFIG_XEN with original min()/max() changes reverted make 1524.97s user 89.84s system 2399% cpu 1:07.31 total make 1521.01s user 88.99s system 2391% cpu 1:07.32 total make 1530.75s user 89.65s system 2389% cpu 1:07.83 total