Re: [PATCH v4 2/5] m68k/bitops: use __builtin_{clz,ctzl,ffs} to evaluate constant expressions

Finn Thain <fthain@xxxxxxxxxxxxxx> · Mon, 5 Feb 2024 10:13:26 +1100 (AEDT)

On Sun, 4 Feb 2024, Vincent MAILHOL wrote:

Sorry for the late feedback, I did not have much time during weekdays.

On Monday. 29 Jan. 2024 at 07:34, Finn Thain <fthain@xxxxxxxxxxxxxx> wrote:
On Sun, 28 Jan 2024, Vincent MAILHOL wrote:
The asm is meant to produce better results when the argument is not
a constant expression.

Is that because gcc's implementation has to satisfy requirements that are
excessively stringent for the kernel's purposes? Or is it a compiler
deficiency only affecting certain architectures?

I just guess that GCC guys followed the Intel datasheet while the
kernel guys like to live a bit more dangerously and rely on some not
so well defined behaviours... But I am really not the person you
should ask.

I just want to optimize the constant folding and this is the only
purpose of this series. I am absolutely not an asm expert. That's also why I
am reluctant to compare the asm outputs.
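
For readers following the thread, the pattern under discussion can be
sketched roughly as below. This is only an illustration, not the code
from the patch: the names sketch_ffs() and ffs_generic() are
hypothetical, and ffs_generic() is a plain C loop standing in for
whatever arch-specific inline asm is used on the non-constant path. It
assumes a compiler that provides __builtin_constant_p() and
__builtin_ffs() (gcc and clang both do).

```c
/*
 * Hypothetical stand-in for an arch-specific inline-asm implementation.
 * Returns the 1-based index of the least significant set bit, or 0 if
 * no bit is set (same convention as ffs()).
 */
static inline int ffs_generic(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;
	while (!(v & 1u)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Hypothetical wrapper: when the argument is a compile-time constant,
 * use the builtin so the compiler can fold the whole expression;
 * otherwise fall back to the hand-written path.
 */
#define sketch_ffs(x) \
	(__builtin_constant_p(x) ? __builtin_ffs(x) : ffs_generic(x))
```

With this shape, a call such as sketch_ffs(0x50) can be folded at
compile time, while a call with a runtime value still goes through the
fallback.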

How does replacing asm with a builtin prevent constant folding?

... The only thing I am not ready to do is to compare the produced
assembly code and confirm whether or not it is better to remove asm
code.

If you do the comparison and find no change, you get to say so in the
commit log, and everyone is happy.

Without getting into details, here is a quick comparison of gcc and
clang generated asm for all the bitops builtins:

  https://godbolt.org/z/Yb8nMKnYf

To the extent of my limited knowledge, it looks rather OK for gcc, but
for clang... it seems that this is the default software implementation.

So are we fine with the current patch? Or maybe clang support is not
important for m68k? I do not know so tell me :)

Let's see if I understand.

You are proposing that the kernel source carry an unquantified 
optimization, with inherent complexity and maintenance costs, just for the 
benefit of users who choose a compiler that doesn't work as well as the 
standard compiler. Is that it?

At some point in the future when clang comes up to scratch with gcc and the
builtin reaches parity with the asm, I wonder if you will then remove both
your optimization and the asm, to eliminate the aforementioned complexity
and maintenance costs. Is there an incentive for that work?