On Thu, 29 Jun 2017, Alexander Monakov wrote:
> (but of course that's a nontrivial amount of work for a somewhat "academic"
> issue)

I think a practical approach is to give the user a degree of control by
introducing a tri-state compiler option controlling how doubleword atomics
are to be emitted:

- -m128-bit-atomics=libcalls

  this is the current behavior: all doubleword atomics become libatomic API
  calls; I don't imagine anyone really likes that;

- -m128-bit-atomics=cas+loads

  all doubleword atomics, including loads, become cas/ll-sc loops; atomic
  loads from read-only memory may trap, but the user may know that their code
  never encounters that situation (e.g. if it's in control of how atomic
  objects are allocated, normally on stack or on heap);

  it would be binary-incompatible with the above option, but, again,
  acceptable in many situations where the atomic objects are not exposed to
  potentially-incompatible code;

  I think this is what users would actually want in practice;

- -m128-bit-atomics=cas-loads

  all doubleword atomics, *excluding loads*, become cas/ll-sc loops; atomic
  loads become calls; in the end the user gets a link error and needs to
  decide how to handle it:

    - either adjust the program such that no plain loads remain, or
    - resolve the link error by providing a definition that works via cas,
      if they know that their accesses won't trap (could be with a static
      inline function, so in the end there's no overhead), or
    - use the solution with kernel+vdso assist from the previous mail :)

  this might eventually become the default compiler behavior.

Alexander
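
For illustration only, here is a minimal sketch of what "a definition that
works via cas" could look like as a static inline function. The names
load_via_cas and dword_t are hypothetical, and dword_t is a 64-bit stand-in
so that the sketch builds without -mcx16 or libatomic; a real doubleword
version would presumably use a 16-byte type instead. It relies on the
GCC/Clang __atomic_compare_exchange_n builtin. Because the CAS may write to
the object, the pointer cannot be const, which is exactly why such a load
can trap on read-only memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the doubleword type; 64-bit here so the
   sketch does not depend on 16-byte CAS support in the toolchain. */
typedef uint64_t dword_t;

/* Atomic load expressed as a compare-and-swap (hypothetical helper).
   We try to replace 0 with 0: if the object holds 0, the CAS succeeds
   and stores the same value back; otherwise it fails and copies the
   current value into 'expected'. Either way 'expected' ends up holding
   the value that was read atomically. */
static inline dword_t load_via_cas(dword_t *obj)
{
    dword_t expected = 0;
    __atomic_compare_exchange_n(obj, &expected, 0, /* weak = */ 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}

/* Small usage example: the load returns the stored value and leaves
   the object unchanged. */
static void example_usage(void)
{
    dword_t counter = 7;
    assert(load_via_cas(&counter) == 7);
    assert(counter == 7);
}
```

Since the helper is static inline, the compiler can expand it at each call
site, which is the "no overhead" property mentioned for that sub-option.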