RE: [PATCH v4 2/5] m68k/bitops: use __builtin_{clz,ctzl,ffs} to evaluate constant expressions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Vincent MAILHOL 
Sent: 28 January 2024 13:28
...
Fair. My goal was not to point to the assembly code but to this sentence:

  However, for non constant expressions, the kernel's ffs() asm
  version remains better for x86_64 because, contrary to GCC, it
  doesn't emit the CMOV assembly instruction

The comment in the code is:
	 * AMD64 says BSFL won't clobber the dest reg if x==0; Intel64 says the
	 * dest reg is undefined if x==0, but their CPU architect says its
	 * value is written to set it to the same as before, except that the
	 * top 32 bits will be cleared.
I guess gcc isn't willing to act on hearsay!


I should have been more clear. Sorry for that.

But the fact remains, on x86, some of the asm produced more optimized
code than the builtin.

I actually suspect the asm versions predate the builtins.

This seems true. The __bultins were introduced in:

  generic: Implement generic ffs/fls using __builtin_* functions
  https://git.kernel.org/torvalds/c/048fa2df92c3

I was thinking of compiler versions not kernel source commits.
You'd need to be doing some serious software archaeology.

when the asm implementation already existed in m68k.

Does (or can) the outer common header use the __builtin functions
if no asm version exists?

Yes, this would be extremely easy. You just need to

  #include/asm-generic/bitops/builtin-__ffs.h
  #include/asm-generic/bitops/builtin-ffs.h
  #include/asm-generic/bitops/builtin-__fls.h
  #include/asm-generic/bitops/builtin-fls.h

and remove all the asm implementations. If you give me your green
light, I can do that change. This is trivial. The only thing I am not
ready to do is to compare the produced assembly code and confirm
whether or not it is better to remove asm code.

Thoughts?

Not for me to say.
But the builtins are likely to generate reasonable code.
So unless the asm can be better (like trusting undocumented
x86 cpu behaviour) using them is probably best.

For the ones you are changing it may be best to propose using
the builtins all the time.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)




[Index of Archives]     [Video for Linux]     [Yosemite News]     [Linux S/390]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux