On 30/05/17 12:28, Jonathan Wakely wrote:
> On 28 May 2017 at 12:15, Toebs Douglass wrote:
>> I would like to write a little about the new libatomic mechanism for
>> What I see however is that there is a way for me to avoid these costs
>> and return to the simple situation.  My code has an abstraction layer,
>> and I can implement inline assembly for double-word CAS on 64-bit
>> platforms and use that instead of __atomic and __sync.
>
> On x86_64 can't you just use __sync_val_compare_and_swap with -mcx16?
>
> Since GCC 4.6 this always emits cmpxchg16b when compiled with -mcx16:
>
> int main()
> {
>   __int128 i = 0;
>   __sync_val_compare_and_swap(&i, 0, 1);
> }
>
> That still works with GCC 7.

I can do this.  I think it would also work for aarch64.

In fact, thank you *VERY* much for mentioning this, because it hadn't
struck me - it will be *infinitely* better for me to use __sync on
aarch64 than to try to write the complex chunk of assembly needed on
that platform, a task I am not competent to undertake.

I like the __atomic API though - I can specify the memory order, it has
the correct built-in compiler barrier, and I can specify weak/strong.
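To make the trade-off concrete, here is a minimal sketch of how an
abstraction layer could wrap both builtins behind the same signature.
The names cas_word, cas_via_sync, cas_via_atomic and cas_demo are
hypothetical, not taken from any real code base.  uint64_t stands in
for the double-word type so that the sketch builds without extra flags.
On x86_64 the __sync form should accept __int128 when compiled with
-mcx16, as described above, while whether the __atomic form is inlined
or routed through libatomic for 16-byte operands depends on the GCC
version and target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the double-word type (assumption: swap in
   __int128 and build with -mcx16 for the 16-byte case on x86_64). */
typedef uint64_t cas_word;

/* __sync flavour: always a full barrier, always a strong CAS.
   On failure, *expected is updated to the value actually observed,
   mirroring the __atomic interface. */
static bool cas_via_sync(cas_word *dest, cas_word *expected, cas_word desired)
{
    cas_word old = __sync_val_compare_and_swap(dest, *expected, desired);
    bool ok = (old == *expected);
    *expected = old;
    return ok;
}

/* __atomic flavour: the caller chooses weak/strong and the memory
   orders for the success and failure paths. */
static bool cas_via_atomic(cas_word *dest, cas_word *expected, cas_word desired)
{
    return __atomic_compare_exchange_n(dest, expected, desired,
                                       false,             /* strong CAS */
                                       __ATOMIC_ACQ_REL,  /* on success */
                                       __ATOMIC_ACQUIRE); /* on failure */
}

/* Single-threaded walk-through of both wrappers. */
static void cas_demo(void)
{
    cas_word value = 0, expected = 0;

    /* value matches expected, so the swap succeeds. */
    assert(cas_via_sync(&value, &expected, 1));
    assert(value == 1);

    /* value is now 1, so comparing against 0 fails and
       expected is overwritten with the observed value. */
    expected = 0;
    assert(!cas_via_atomic(&value, &expected, 2));
    assert(expected == 1 && value == 1);
}
```

With both wrappers sharing one signature, the rest of the code base does
not need to know which builtin is in use on a given platform.  The cost
of the __sync version is that the memory order and weak/strong choice
are fixed.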