Hi all.

As far as I can tell, in GCC 7.1.0 the x86_64-specific option "-mcx16" has been removed. This means that the atomic intrinsics will no longer emit cmpxchg16b, the double-word CAS instruction. So x86 will continue to support double-word CAS (it only needs cmpxchg8b), but x86_64 will not.

As such, I now need to use inline assembly on this platform. I am not, however, well-informed about assembly. I'm barely informed about C :-)

This is the code I have:

    result = 0;

    __asm__ __volatile__
    (
      "lock;"           /* make cmpxchg16b atomic */
      "cmpxchg16b %0;"  /* cmpxchg16b sets ZF on success */
      "setz       %4;"  /* if ZF set, set result to 1 */

      /* output */
      : "+m" (pointer_to_destination[0]),
        "+m" (pointer_to_destination[1]),
        "+a" (pointer_to_compare[0]),
        "+d" (pointer_to_compare[1]),
        "=q" (result)

      /* input */
      : "b" (pointer_to_new_destination[0]),
        "c" (pointer_to_new_destination[1])

      /* clobbered */
      :
    );

I have always found inline assembly in GCC difficult, and I don't know enough to use it correctly. I would appreciate any corrections or advice regarding the GCC semantics for writing this code. In particular, I'm wondering whether I should be marking "memory" as clobbered, since cmpxchg16b will force a full memory barrier.

One final question relates to compiler barriers. The more recent __atomic intrinsics do, I believe, always issue a compiler barrier appropriate to the memory order used. However, I think there are only three compiler barriers (load, store and full), so there is not a set of compiler barriers matching the memory orders.

I think the "__asm__ __volatile__" will inherently issue a full compiler barrier. Is this correct?

The older __sync intrinsics do not, as I understand it, issue a compiler barrier, and the user must issue one. Although I may be wrong, I understand that prior to the __atomic API a compiler barrier consists of:

    __asm__ __volatile__ ( "" : : : "memory" );

A full compiler barrier prevents reordering across the barrier. However, the compiler barrier itself is, or appears to be, a line of code. It seems, on the face of it, that I need to issue such a barrier both immediately above and immediately below the __sync intrinsic. Is this correct? I suspect I am in some way very confused here.
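
For comparison with the code above, here is a minimal sketch of how the same operation might be written with the "memory" clobber added. It rests on several assumptions and is not a definitive answer: the names dw_t, dwcas and COMPILER_BARRIER are hypothetical, the destination is treated as one 16-byte memory operand rather than two 8-byte ones, and the result is a byte-sized variable because setz writes a single byte. The GCC documentation notes that "volatile" alone does not stop the compiler moving an asm statement relative to other code, so the sketch relies on the "memory" clobber for the compiler barrier.

```c
#include <assert.h>
#include <stdint.h>

#if !defined(__x86_64__)
#error "cmpxchg16b exists only on x86_64"
#endif

/* Hypothetical type: two adjacent 64-bit words. cmpxchg16b faults
   unless its memory operand is 16-byte aligned. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} __attribute__((aligned(16))) dw_t;

/* Compiler-only barrier: emits no instruction, but the "memory"
   clobber stops the compiler moving memory accesses across it. */
#define COMPILER_BARRIER() __asm__ __volatile__("" : : : "memory")

/* Hypothetical helper: returns 1 if *destination equalled *compare and
   was replaced by *new_value. On failure returns 0 and *compare is
   overwritten with the value found in *destination. */
static inline int dwcas(dw_t *destination, dw_t *compare,
                        const dw_t *new_value)
{
    unsigned char result; /* setz writes exactly one byte */

    assert(((uintptr_t)destination & 15u) == 0);

    __asm__ __volatile__(
        "lock cmpxchg16b %[dest]\n\t" /* compare rdx:rax with memory */
        "setz %[res]"                 /* ZF is set on success */
        /* output */
        : [dest] "+m" (*destination), /* whole 16-byte operand */
          "+a" (compare->lo),
          "+d" (compare->hi),
          [res] "=q" (result)
        /* input */
        : "b" (new_value->lo),
          "c" (new_value->hi)
        /* clobbered: "memory" makes this asm a compiler barrier too */
        : "memory", "cc");

    return result;
}
```

With the "memory" clobber on the asm statement itself, this sketch should not need a separate COMPILER_BARRIER() line above and below the call; the macro is included only to show the stand-alone form asked about.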