On 3/21/2018 9:35 AM, David Laight wrote:
>> x86 has compiler barrier inside the relaxed() API so that code does not
>> get reordered. ARM64 architecturally guarantees device writes to be observed
>> in order.
>
> There are places where you don't even need a compile barrier between
> every write.
>
> I had horrid problems getting some ppc code (for a specific embedded SoC)
> optimised to have no extra barriers.
> I ended up just writing through 'pointer to volatile' and adding an
> explicit 'eieio' between the block of writes and status read.
>
> No less painful was doing a byteswapping write to normal memory.

If the architecture reorders writes to the peripheral, then removing the
compiler barrier can break multi-arch drivers. The barriers document clearly
states that the device needs to observe writes in order.

Though for special cases like the one you mentioned, you can certainly do this:

	wmb()
	__raw_write/pointer access
	__raw_write/pointer access
	__raw_write/pointer access
	/* flush everything */
	mmiowb()
	__raw_write/pointer access

There would be no ordering guarantee for the writes between the wmb() and
the mmiowb().

This can only be done for known code and known hardware. I don't believe it
applies to multi-arch drivers.

>
> David
>

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
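
The barrier sequence described in the reply can be sketched in userspace C as follows. This is only an illustration under stated assumptions: sketch_wmb(), sketch_mmiowb() and sketch_raw_writel() are hypothetical stand-ins for the kernel's wmb(), mmiowb() and __raw_writel(), implemented here as compiler-only barriers and a plain volatile store, and the register layout is invented. It shows the shape of the code, not the hardware ordering a real architecture would provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins so the sketch builds outside the kernel.
 * They only stop the compiler from moving memory accesses across them;
 * the real wmb()/mmiowb() also emit CPU barrier instructions where needed.
 */
#define sketch_wmb()    __asm__ __volatile__("" ::: "memory")
#define sketch_mmiowb() __asm__ __volatile__("" ::: "memory")

/* Stand-in for __raw_writel(): a bare store through a pointer to volatile. */
static inline void sketch_raw_writel(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* Invented register layout for a pretend device. */
enum { REG_DESC0, REG_DESC1, REG_DESC2, REG_DOORBELL, REG_COUNT };

/*
 * One barrier before the block of writes, one before the final write.
 * Nothing orders the three descriptor writes relative to each other.
 */
static void post_descriptor(volatile uint32_t *regs,
			    uint32_t d0, uint32_t d1, uint32_t d2)
{
	assert(regs != NULL);

	sketch_wmb();				/* earlier stores come first */
	sketch_raw_writel(d0, &regs[REG_DESC0]);
	sketch_raw_writel(d1, &regs[REG_DESC1]);
	sketch_raw_writel(d2, &regs[REG_DESC2]);
	sketch_mmiowb();			/* flush everything */
	sketch_raw_writel(1, &regs[REG_DOORBELL]);
}
```

A caller would pass the mapped register base, here simulated by a volatile uint32_t array of REG_COUNT entries. As the reply notes, a pattern like this is only reasonable for known code on known hardware.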