On Sat, 2024-07-27 at 17:38 -0700, Andrew Pinski via Gcc-help wrote:
> On Sat, Jul 27, 2024 at 3:37 PM pifminns deettnta via Gcc
> <gcc@xxxxxxxxxxx> wrote:
> >
> > using uint_least64_t = __UINT_LEAST64_TYPE__;
> >
> > uint_least64_t testbswap(uint_least64_t a) noexcept
> > {
> >     return __builtin_bswap64(a);
> > }
> >
> > clang:
> > https://godbolt.org/z/z8GTsazf4
> >
> >
> > _Z9testbswapm:
> >         revb.d  $a0, $a0
> >         ret
> >
> >
> > GCC:
> >
> > https://godbolt.org/z/PabfxP9ve
> >
> > _Z9testbswapm:
> >         revb.4h $r4,$r4
> >         revh.d  $r4,$r4
> >         jr      $r1
> >
> > It should just use revb.d for bswap, not separate them into two.
>
> The code generation is not wrong, just not as good.
> GCC swaps bytes inside 4 half words and then swaps the half words.
>
> I looked into the history of GCC's code generation here and noticed it
> comes from the original port when committed upstream. even the bswap32
> is also done using 2 instructions. Now I am suspecting is the original
> Loongson ISA didn't have revb.d/revb.2w when the GCC port was done and
> it was added afterwards and GCC port was never updated to use the new
> instructions.

revb.2w and revb.d have been available since day one. I have no idea why
they weren't used. (Also, strangely, there's no revb.w, so on the [still
insubstantial] 32-bit LoongArch we would have to use revb.2h + rotri.w
for bswap32, but that may change before a real 32-bit CPU is launched.)

I'm preparing a patch.

-- 
Xi Ruoyao <xry111@xxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University
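
For reference, here is a minimal sketch in portable C++ of why the two-instruction sequence quoted above is correct but redundant. The helper names are hypothetical and only model what the thread says revb.4h and revh.d do (swap the bytes inside each of the four halfwords, then reverse the halfwords). They are not taken from GCC or from the ISA manual. A C++14 or later compiler is assumed.

```cpp
#include <cstdint>

// Hypothetical model of revb.4h: swap the two bytes inside each of the
// four 16-bit halfwords of a 64-bit value.
constexpr std::uint64_t swap_bytes_in_halfwords(std::uint64_t x) noexcept
{
    return ((x & 0x00FF00FF00FF00FFull) << 8) |
           ((x >> 8) & 0x00FF00FF00FF00FFull);
}

// Hypothetical model of revh.d: reverse the order of the four halfwords.
constexpr std::uint64_t reverse_halfwords(std::uint64_t x) noexcept
{
    // Swap adjacent halfwords inside each 32-bit word, then swap the words.
    x = ((x & 0x0000FFFF0000FFFFull) << 16) |
        ((x >> 16) & 0x0000FFFF0000FFFFull);
    return (x << 32) | (x >> 32);
}

// The two-step sequence that GCC emits in the quoted listing.
constexpr std::uint64_t bswap64_two_step(std::uint64_t x) noexcept
{
    return reverse_halfwords(swap_bytes_in_halfwords(x));
}

// Plain byte-by-byte reversal, i.e. what a single revb.d is expected to do.
constexpr std::uint64_t bswap64_reference(std::uint64_t x) noexcept
{
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (x & 0xFF);
        x >>= 8;
    }
    return r;
}

static_assert(bswap64_two_step(0x0102030405060708ull) ==
                  bswap64_reference(0x0102030405060708ull),
              "two-step sequence must equal a full byte swap");
```

Both paths produce the same value, so the difference between the two compilers' output is only instruction count, not correctness.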