On Fri, Mar 3, 2017 at 9:26 AM, Christian Borntraeger
<borntraeger@xxxxxxxxxx> wrote:
> On 03/02/2017 10:45 PM, Arnd Bergmann wrote:
>> Ok, got it. So I guess the behavior of forcing aligned accesses on aligned
>> data is accidental, and allowing non-power-of-two arguments is also not
>> the main purpose.
>
> Right. The main purpose is to read/write _ONCE_. You can assume a somewhat
> atomic access for sizes <= word size. And there are certainly places that
> rely on that. But the *ONCE thing is mostly used for things where we used
> barrier() 10 years ago.

Ok

>> Maybe we could just bail out on new compilers if we get
>> either of those? That might catch code that accidentally does something
>> that is inherently non-atomic or that causes a trap when the intention was
>> to have a simple atomic access.
>
> I think Linus stated that its ok to assume that the compiler is smart enough
> to uses a single instruction to access aligned and properly sized scalar types
> for *ONCE.
>
> Back then when I changed ACCESS_ONCE there were many places that did use it
> for non-atomic, > word size accesses. For example on some architectures a pmd_t
> is a typedef to an array, for which there is no way to read that atomically.
> So the focus must be on the "ONCE" part.
>
> If some code uses a properly aligned, word sized object we can also assume
> atomic access. If the access is not properly sized/aligned we do not get
> atomicity, but we do get the "ONCE".
> But adding a check for alignment/size would break the compilation of some
> code.

So what should be the expected behavior for objects that have a smaller
alignment? E.g. this structure

struct fourbytes {
	char bytes[4];
} __packed;

when passed into the current READ_ONCE() will be accessed with a 32-bit load,
while reading it with

struct fourbytes local = *(volatile struct fourbytes *)voidpointer;

on architectures like ARMv5 or lower will turn into four single-byte
reads to avoid an alignment trap when the pointer is actually
unaligned.

I can see arguments for and against either behavior, but what should
I do when modifying it for newer compilers? The possible options
that I see are

- keep assuming that the pointer will be aligned at runtime
  and doesn't trap
- use the regular gcc behavior and do byte-accesses on those
  architectures that otherwise might trap
- add a runtime alignment check to do atomic accesses whenever
  possible, but never trap
- fail the build

      Arnd
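
For illustration only, the third option (a runtime alignment check) could
look roughly like the sketch below. read_fourbytes_once() is a hypothetical
helper, not an existing kernel interface, and __attribute__((packed)) is
spelled out because the kernel's __packed macro is not available in a
standalone build. The sketch assumes that a 32-bit load from a 4-byte-aligned
address compiles to a single instruction, and that the code is built with
-fno-strict-aliasing as the kernel is.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the example structure: size 4, alignment 1. */
struct fourbytes {
	char bytes[4];
} __attribute__((packed));

/*
 * Hypothetical helper (not a kernel API): read the object exactly once,
 * using a single 32-bit load when the address allows it and falling back
 * to byte loads otherwise, so it never takes an alignment trap.
 */
static inline struct fourbytes read_fourbytes_once(const void *p)
{
	struct fourbytes val;

	assert(p != NULL);

	if (((uintptr_t)p & (sizeof(uint32_t) - 1)) == 0) {
		/* Aligned: one volatile 32-bit load (assumes -fno-strict-aliasing). */
		uint32_t word = *(const volatile uint32_t *)p;

		memcpy(&val, &word, sizeof(val));
	} else {
		/* Unaligned: each byte is read once, but the whole is not atomic. */
		const volatile char *src = p;

		for (size_t i = 0; i < sizeof(val); i++)
			val.bytes[i] = src[i];
	}

	return val;
}
```

The unaligned path still gives the "ONCE" guarantee per byte but no
atomicity for the object as a whole, and the price of this option is a
branch on every access.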