On Thu, Mar 26, 2015 at 09:21:50AM -0700, Linus Torvalds wrote:
> So the proper patch looks something like this:
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 1b45e4a0519b..f36e1abf56ea 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -198,10 +198,6 @@ __compiletime_warning("data access exceeds word size and won't be atomic")
>  #endif
>  ;

You also want to get rid of that ^^^ declaration which includes the
compiletime warning.

> -static __always_inline void data_access_exceeds_word_size(void)
> -{
> -}
> -
>  static __always_inline void __read_once_size(const volatile void *p, void *res, int size)
>  {
>  	switch (size) {
> @@ -214,7 +210,6 @@ static __always_inline void __read_once_size(const volatile void *p, void *res,
>  	default:
>  		barrier();
>  		__builtin_memcpy((void *)res, (const void *)p, size);
> -		data_access_exceeds_word_size();
>  		barrier();
>  	}
>  }
> @@ -231,7 +226,6 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
>  	default:
>  		barrier();
>  		__builtin_memcpy((void *)p, (const void *)res, size);
> -		data_access_exceeds_word_size();
>  		barrier();
>  	}
>  }

So this still has the potential to generate significantly worse code
than the previous ACCESS_ONCE() because of the dual barrier() and god
only knows how memcpy gets implemented.

As it turns out, the ARM version of __builtin_memcpy() does result in a
single double word load because it's all smart and such, but a 'trivial'
implementation might just end up doing eight single-byte copies.

Can't we make an argument that these barrier calls are not required? The
memcpy() call already guarantees we emit the loads and it's opaque, so
the compiler cannot 'cache' the value.

So I see no immediate reason for the dual memory clobber.
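
For reference, a minimal userspace sketch of the default: path being
discussed, assuming GCC or Clang. The names read_once_fallback(),
struct wide and check_wide_copy() are made up for illustration and are
not from compiler.h; barrier() is written out the way the kernel defines
it for GCC, as an empty asm with a "memory" clobber. It only shows what
the code between the two barriers does. It does not settle whether the
barriers are needed, and nothing here makes the copy a single access.

```c
#include <assert.h>

/* Compiler-only barrier: emits no instructions, but the "memory"
 * clobber tells the compiler not to keep memory values cached in
 * registers across this point. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the default: case of __read_once_size().
 * How the copy is split into loads is up to the compiler's memcpy
 * expansion; it may be one wide load or several narrow ones. */
static inline void read_once_fallback(const volatile void *p, void *res,
				      int size)
{
	barrier();
	__builtin_memcpy(res, (const void *)p, size);
	barrier();
}

/* Hypothetical 16-byte type, wider than a machine word on 64-bit. */
struct wide {
	unsigned long long lo;
	unsigned long long hi;
};

/* Returns 1 if the fallback path copied both halves intact. */
static int check_wide_copy(void)
{
	struct wide src = { 0x1122334455667788ULL, 0x99aabbccddeeff00ULL };
	struct wide dst = { 0, 0 };

	read_once_fallback(&src, &dst, sizeof(dst));
	return dst.lo == src.lo && dst.hi == src.hi;
}
```

Building this with and without the two barrier() lines and comparing the
output of "gcc -O2 -S" is one way to see what the dual clobber costs on
a given architecture.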