On 28/06/17 08:33, Toebs Douglass wrote:
> On 28/06/17 00:53, Alexander Monakov wrote:
>> On Tue, 27 Jun 2017, Richard Earnshaw wrote:
>>> Only *seems* to work.  The ldxp operation is *NOT* atomic and atomicity
>>> is only guaranteed if a store back of the original read value completes.
>>>
>>> In general this makes ldxp not useful for atomic operations since we
>>> cannot guarantee that the location is writable.
>>
>> But the original question was not about atomic reads, it was about atomic CAS,
>> so obviously the location is going to written.

Exactly.  AArch64 does indeed have SWAP_16.

>> It seems in armv8.1a there's also a separate doubleword cas instruction, CASP?
>> It appears to be implemented in Binutils: https://sourceware.org/ml/binutils/2014-09/msg00021.html
>> but not in GCC.
>
> Yes.  I wonder why ARM did that.  Unless the internal implementation is
> different (no ERG)

ERG?

> then it seems just a wrapper for an LL/SC loop, and for me it's hard
> to imagine that being worth an instruction.

It avoids ping-ponging between cache lines on a highly-contended large
system.  A CAS can be performed remotely from the processor that is
executing the instruction, without moving the data into its cache.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77  2FAD A5CD 6035 332F A671
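
For readers following the thread, the difference between an atomic read and an atomic CAS can be sketched in portable C11. This is only an illustration under stated assumptions: it uses a 64-bit word as a stand-in for the 16-byte case discussed above (so it builds without 128-bit atomic support), and the helper names load_via_cas, fetch_add_via_cas and demo are hypothetical, not taken from GCC or any library. It shows that a read built from CAS may have to store to the location, which is why the location must be writable, and what a CAS retry loop looks like at the source level.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: read *p using nothing but compare-and-swap.
 * If *p == 0 the CAS succeeds and stores 0 back (a real write);
 * otherwise it fails and copies the current value into `expected`.
 * Either way `expected` ends up holding the value that was observed,
 * but the location has to be writable for this to be safe. */
uint64_t load_via_cas(_Atomic uint64_t *p)
{
    uint64_t expected = 0;
    atomic_compare_exchange_strong(p, &expected, 0);
    return expected;
}

/* Hypothetical helper: a read-modify-write built as a CAS retry loop.
 * On failure, `old` is refreshed with the current value and we retry. */
uint64_t fetch_add_via_cas(_Atomic uint64_t *p, uint64_t delta)
{
    uint64_t old = atomic_load(p);
    while (!atomic_compare_exchange_weak(p, &old, old + delta)) {
        /* `old` now holds the latest value; loop and try again. */
    }
    return old;
}

/* Small single-threaded usage sketch. */
void demo(void)
{
    _Atomic uint64_t v = 41;
    assert(load_via_cas(&v) == 41);          /* read without changing v */
    assert(fetch_add_via_cas(&v, 1) == 41);  /* returns the old value */
    assert(load_via_cas(&v) == 42);
}
```

How the compiler lowers these calls (an LL/SC loop or a single CAS instruction) depends on the target and its options; the sketch makes no claim about that.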