On 30 May 2017 at 14:57, Andrew Haley wrote:
> On 30/05/17 11:28, Jonathan Wakely wrote:
>> On 28 May 2017 at 12:15, Toebs Douglass wrote:
>>> I would like to write a little about the new libatomic mechanism for
>>> What I see however is that there is a way for me to avoid these costs
>>> and return to the simple situation. My code has an abstraction layer,
>>> and I can implement inline assembly for double-word CAS on 64-bit
>>> platforms and use that instead of __atomic and __sync.
>>
>> On x86_64 can't you just use __sync_val_compare_and_swap with -mcx16?
>>
>> Since GCC 4.6 this always emits cmpxchg16b when compiled with -mcx16:
>>
>> int main()
>> {
>>   __int128 i = 0;
>>   __sync_val_compare_and_swap(&i, 0, 1);
>> }
>>
>> That still works with GCC 7.
>
> I just built GCC from trunk, and this gives me:
>
> mustang-b0:~ $ /scratch/gcc/trunk/install/bin/gcc zzz.c
> /tmp/ccmc3Y73.o: In function `main':
> zzz.c:(.text+0x20): undefined reference to `__sync_val_compare_and_swap_16'
> collect2: error: ld returned 1 exit status
>
> I can't figure out where __sync_val_compare_and_swap_16 should be
> defined. It's not in libatomic or libc or libgcc.

But if you compile with -mcx16 then it should use cmpxchg16b instead
of a library call.

That only works for x86_64 (like the -mcx16 option itself). I don't
know how to guarantee no library call for aarch64.
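
For what it's worth, here is a minimal sketch of how an abstraction layer
might check at compile time whether the 16-byte __sync builtin will be
expanded inline. It relies on the predefined macro
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16, which as far as I know GCC defines on
x86_64 only when -mcx16 (or an -march that implies it) is in effect. The
function name dw_cas_demo and its return convention are made up for
illustration and are not from this thread. I have not checked what the
macro reports on aarch64, so treat that side as an open question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (name and return values are illustrative only).
   Returns  1 if a 16-byte CAS was compiled in and swapped 0 -> 1,
            0 if it was compiled in but did not swap,
           -1 if this build does not advertise a 16-byte CAS. */
int dw_cas_demo(void)
{
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && defined(__SIZEOF_INT128__)
    __int128 v = 0;

    /* cmpxchg16b needs a 16-byte aligned operand; a plain __int128
       object should already satisfy that. */
    assert(((uintptr_t)&v % 16) == 0);

    /* Same builtin as in the quoted example: swap in 1 if v is 0. */
    __int128 old = __sync_val_compare_and_swap(&v, (__int128)0, (__int128)1);

    return (old == 0 && v == 1) ? 1 : 0;
#else
    /* No 16-byte CAS advertised: do not reference the builtin at all, so
       no call to __sync_val_compare_and_swap_16 can reach the linker. */
    return -1;
#endif
}
```

Called from main and built with "gcc -mcx16" on x86_64, this should take
the first branch. Without -mcx16 I would expect the macro to be undefined,
in which case the function returns -1 and there is no link error.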