Re: [PATCH v2 1/2] include/uapi/linux/swab.h: move default implementation for swab macros into asm-generic

Ignacio Encinas Rubio <ignacio@xxxxxxxxxxxx> · Fri, 21 Mar 2025 19:38:54 +0100

On 21/3/25 11:23, Arnd Bergmann wrote:
> On Thu, Mar 20, 2025, at 23:36, Ignacio Encinas Rubio wrote:
>> On 19/3/25 22:49, Arnd Bergmann wrote:
>>> On Wed, Mar 19, 2025, at 22:37, Ignacio Encinas Rubio wrote:
>>>> On 19/3/25 22:12, Arnd Bergmann wrote:
>>> Right, I do remember when we had a discussion about this maybe
>>> 15 years ago when gcc didn't have the builtins on all architectures
>>> yet, but those versions are long gone, and we never cleaned it up.
>>
>> I just had a chance to look at this and it looks a bit more complex than
>> I initially thought. ___constant_swab macros are used in more places
>> than I expected, and {little,big}_endian.h define their own macros that
>> are used elsewhere, ...
>>
>> It is not clear to me how to proceed here. I could:
>>
>>   1) Just remove ___constant_swab macros and replace them with
>>   __builtin_swap everywhere
>>
>>   2) Go a step further and evaluate removing __constant_htonl and
>>   relatives
>>
>> Let me know what you think is the best option :)
> 
> I think we can start enabling CONFIG_ARCH_USE_BUILTIN_BSWAP
> on all architectures and removing the custom versions
> from arch/*/include/uapi/asm/swab.h, which all seem to
> predate the compiler builtins and likely produce worse code.

This seems fine for some architectures but I don't think we can use
this approach for RISC-V. RISC-V code assumes that the bitmanip 
extension might not be available (see arch/riscv/include/asm/bitops.h).

The current approach [1] is to detect this at boot and patch the kernel 
to adapt it to the actual hardware running it (using specific 
instructions or not).

On the other hand, I tried using __builtin_swap for the RISC-V version 
as an alternative to the "optimized" one (instead of relying on
___constant_swab, see [2]) and I immediately got compilation errors. 

Some architectures seem to require definitions for __bswapsi2 and 
__bswapdi2 [3]. I'm guessing this happens for the architectures that
don't require bit manipulation instructions but have them as extensions.

arm,csky,mips and xtensa seem to fit this description as they 
feature their own __bswapsi2 implementations. Note that they simply
call ___constant_swab or are ___constant_swab written in assembly
language [4] [5].

Unless I'm missing something, it seems to me that using compiler 
builtins (at least for RISC-V, and potentially others) is even more 
problematic than keeping ___constant_swab around. What do you think, 
should we keep patch 1 after all?

We could remove __arch_swab for architectures that always assume bit 
manipulation instructions availability, but then the kernel would fall
back into ___constant_swab when CONFIG_ARCH_USE_BUILTIN_BSWAP=n. Turning
their custom implementations into 

	#define __arch_swabXY __builtin_bswapXY

would solve this issue, but I'm not sure it is an acceptable approach.

Thanks!

[1] https://lore.kernel.org/all/ce034f2b-2f6e-403a-81f1-680af4c72929@xxxxxxxx/
[2] https://lore.kernel.org/all/20250319-riscv-swab-v2-2-d53b6d6ab915@xxxxxxxxxxxx/
[3] https://gcc.gnu.org/onlinedocs/gcc-13.3.0/gccint.pdf
[4] https://lore.kernel.org/all/20230512164815.2150839-1-jcmvbkbc@xxxxxxxxx/
[5] https://lore.kernel.org/all/1664437198-31260-3-git-send-email-yangtiezhu@xxxxxxxxxxx/