On Sat, Jul 27, 2024 at 3:37 PM pifminns deettnta via Gcc <gcc@xxxxxxxxxxx> wrote:
>
> using uint_least64_t = __UINT_LEAST64_TYPE__;
>
> uint_least64_t testbswap(uint_least64_t a) noexcept
> {
>     return __builtin_bswap64(a);
> }
>
> clang:
> https://godbolt.org/z/z8GTsazf4
>
> _Z9testbswapm:
>     revb.d $a0, $a0
>     ret
>
> GCC:
> https://godbolt.org/z/PabfxP9ve
>
> _Z9testbswapm:
>     revb.4h $r4,$r4
>     revh.d $r4,$r4
>     jr $r1
>
> It should just use revb.d for bswap, not separate them into two.

The code generation is not wrong, just not as good. GCC swaps the bytes inside each of the 4 halfwords (revb.4h) and then reverses the order of the halfwords (revh.d).

I looked into the history of GCC's code generation here and noticed that it comes from the original port as it was committed upstream. Even bswap32 is done using 2 instructions.

I now suspect that the original Loongson ISA didn't have revb.d/revb.2w when the GCC port was done, that those instructions were added afterwards, and that the GCC port was never updated to use them.
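
To illustrate why the two-instruction sequence is still correct, here is a minimal C++ sketch that models each step in portable code. The helper names (swap_bytes_in_halfwords, reverse_halfwords, two_step_bswap64, self_check) are made up for this example and are not part of GCC or the LoongArch port; the bit masks are my reading of what revb.4h and revh.d do, based on the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of revb.4h: swap the two bytes inside each of the
// four 16-bit halfwords of a 64-bit value.
constexpr std::uint64_t swap_bytes_in_halfwords(std::uint64_t x) noexcept
{
    return ((x & 0x00FF00FF00FF00FFull) << 8)
         | ((x >> 8) & 0x00FF00FF00FF00FFull);
}

// Hypothetical model of revh.d: reverse the order of the four halfwords.
constexpr std::uint64_t reverse_halfwords(std::uint64_t x) noexcept
{
    return (x << 48)
         | ((x & 0x00000000FFFF0000ull) << 16)
         | ((x >> 16) & 0x00000000FFFF0000ull)
         | (x >> 48);
}

// The two steps composed, in the order GCC emits them.
constexpr std::uint64_t two_step_bswap64(std::uint64_t x) noexcept
{
    return reverse_halfwords(swap_bytes_in_halfwords(x));
}

// Compile-time check on one value: the composition is a full byte swap.
static_assert(two_step_bswap64(0x0102030405060708ull) == 0x0807060504030201ull,
              "two-step sequence must equal a full 64-bit byte swap");

// Runtime cross-check against the compiler builtin (GCC and Clang).
inline void self_check() noexcept
{
    const std::uint64_t samples[] = {0ull, 1ull, 0x0102030405060708ull,
                                     0xFFEEDDCCBBAA9988ull};
    for (std::uint64_t v : samples)
        assert(two_step_bswap64(v) == __builtin_bswap64(v));
}
```

Both forms compute the same result, so the only difference is the instruction count: a single revb.d does the work of the pair.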