Re: [PATCH v3 1/5] m68k/bitops: force inlining of all bitops functions

On Tue, 2 Jan 2024 at 19:28, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:

Hi Vincent,

Thanks for your patch!

Thanks for the review and for running the benchmark.

On Sun, Dec 17, 2023 at 8:13 AM Vincent Mailhol
<mailhol.vincent@xxxxxxxxxx> wrote:
The inline keyword does not actually guarantee that the compiler will
inline a function. Whenever the goal is to actually inline a
function, __always_inline should be preferred instead.
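
To illustrate the point above, here is a minimal user-space sketch,
not the kernel code itself. The names demo_always_inline,
demo_test_bit_hint() and demo_test_bit_forced() are hypothetical; the
macro only mirrors how the kernel builds __always_inline on top of the
GCC/Clang always_inline attribute.

```c
/* Hypothetical user-space stand-in for the kernel's __always_inline. */
#define demo_always_inline inline __attribute__((__always_inline__))

/*
 * Plain 'inline' is only a hint: depending on its heuristics, the
 * compiler may still emit and call an out-of-line copy.
 */
static inline int demo_test_bit_hint(unsigned int nr, unsigned long word)
{
	return (word >> nr) & 1UL;
}

/*
 * With the attribute, GCC and Clang inline every direct call to the
 * function regardless of their size heuristics.
 */
static demo_always_inline int demo_test_bit_forced(unsigned int nr,
						   unsigned long word)
{
	return (word >> nr) & 1UL;
}
```

Both helpers return the same values; only the code generation may
differ, which is what the size figures in this thread measure.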

On an allyesconfig, with GCC 13.2.1, it saves roughly 5 KB.

  $ size --format=GNU vmlinux.before vmlinux.after
        text       data        bss      total filename
    60449738   70975612    2288988  133714338 vmlinux.before
    60446534   70972412    2289596  133708542 vmlinux.after

With gcc 9.5.0-1ubuntu1~22.04, the figures are completely different
(i.e. a size increase):

Those results are not normal: there should not be such a big
discrepancy between two versions of the same compiler. I double
checked everything and found out that I made a mistake when computing
the figures. I am not sure what exactly happened, but at some point
the ASLR seeds (or some other similar randomization feature) got reset,
so the decrease I witnessed was just a "lucky roll".

After rerunning the benchmark (making sure to keep every seed), I got
results similar to yours:

        text       data        bss      total filename
    60449738   70975356    2288988  133714082 vmlinux_allyesconfig.before_this_series
    60446534   70979068    2289596  133715198 vmlinux_allyesconfig.after_first_patch
    60429746   70979132    2291676  133700554 vmlinux_allyesconfig.final_second_patch

Note that there is still some kind of randomness in the data segment,
as shown in these other benchmarks I ran:

        text       data        bss      total filename
    60449738   70976124    2288988  133714850 vmlinux_allyesconfig.before_this_series
    60446534   70980092    2289596  133716222 vmlinux_allyesconfig.after_first_patch
    60429746   70979388    2291676  133700810 vmlinux_allyesconfig.after_second_patch

        text       data        bss      total filename
    60449738   70975612    2288988  133714338 vmlinux_allyesconfig.before_this_series
    60446534   70980348    2289596  133716478 vmlinux_allyesconfig.after_first_patch
    60429746   70979900    2291676  133701322 vmlinux_allyesconfig.after_second_patch

But the error margin is within 1K.

So, in short, I inlined some functions which I shouldn't have. I am
preparing a v4 in which I will only inline the bit-find functions
(namely: __ffs(), ffs(), ffz(), __fls(), fls() and fls64()). Here are
the new figures:

        text       data        bss      total filename
    60453552   70955485    2288620  133697657 vmlinux_allyesconfig.before_this_series
    60450304   70953085    2289260  133692649 vmlinux_allyesconfig.after_first_patch
    60433536   70952637    2291340  133677513 vmlinux_allyesconfig.after_second_patch

N.B. The new figures were taken after a rebase, so do not try to compare
them with the previous benchmarks. I will send the v4 soon, after I
finish updating the patch comments and double checking things.
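
For readers unfamiliar with the bit-find functions named above, here is
a rough user-space sketch of their semantics using GCC/Clang builtins.
It is not the m68k implementation; the demo_ prefixed names and
DEMO_BITS_PER_LONG are hypothetical. As in the kernel, the
double-underscore variants and ffz() are undefined for an input with no
matching bit.

```c
#include <limits.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* 0-based index of the lowest set bit; undefined if word == 0. */
static inline unsigned long demo__ffs(unsigned long word)
{
	return __builtin_ctzl(word);
}

/* 1-based index of the lowest set bit; returns 0 if x == 0. */
static inline int demo_ffs(int x)
{
	return __builtin_ffs(x);
}

/* 0-based index of the lowest clear bit; undefined if word == ~0UL. */
static inline unsigned long demo_ffz(unsigned long word)
{
	return __builtin_ctzl(~word);
}

/* 0-based index of the highest set bit; undefined if word == 0. */
static inline unsigned long demo__fls(unsigned long word)
{
	return DEMO_BITS_PER_LONG - 1 - __builtin_clzl(word);
}

/* 1-based index of the highest set bit; returns 0 if x == 0. */
static inline int demo_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}

/* Same as demo_fls(), for a 64-bit argument. */
static inline int demo_fls64(unsigned long long x)
{
	return x ? 64 - __builtin_clzll(x) : 0;
}
```

For example, with the value 0x18 (bits 3 and 4 set), demo__ffs() gives
3, demo_ffs() gives 4, demo__fls() gives 4 and demo_fls() gives 5.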

Concerning the other functions in bitops.h, there may be some other
ones worth an __always_inline. But I will narrow the scope of this
series to the bit-find functions only. If a Good Samaritan wants to
investigate the other functions, go ahead!

Yours sincerely,
Vincent Mailhol




allyesconfig:

      text       data        bss      total filename
  58878600   72415994    2283652  133578246 vmlinux.before
  58882250   72419706    2284004  133585960 vmlinux.after

atari_defconfig:

      text       data        bss      total filename
   4112060    1579862     151680    5843602 vmlinux-v6.7-rc8
   4117008    1579350     151680    5848038 vmlinux-v6.7-rc8-1-m68k-bitops-force-inlining

The next patch offsets that for allyesconfig, but not for atari_defconfig.

Reference: commit 8dd5032d9c54 ("x86/asm/bitops: Force inlining of
test_and_set_bit and friends")

Please don't split lines containing tags.

Link: https://git.kernel.org/torvalds/c/8dd5032d9c54

Signed-off-by: Vincent Mailhol <mailhol.vincent@xxxxxxxxxx>

Reviewed-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
