On Mon, Jun 03, 2024 at 05:22:22PM +0100, Maciej W. Rozycki wrote:
> On Fri, 31 May 2024, Paul E. McKenney wrote:
>
> > >  You're the RCU expert so you know the answer.  I don't.  If it's OK for
> > > successive writes to get reordered, or readers to see a stale value, then
> > > you don't need memory barriers.  Otherwise you do.  Whether byte accesses
> > > are available or not does not matter, the CPU *will* do reordering if it's
> > > allowed to (or more specifically, it won't do anything to prevent it from
> > > happening, especially in SMP configurations; I can't remember offhand if
> > > there are cases with UP).  Also adjacent byte writes may be merged, but I
> > > suppose it does not matter, or does it?
> >
> > RCU uses whichever wrapper is required.  For example, if ordering is
> > required, it might use smp_store_release() and smp_load_acquire().
> > If ordering does not matter, it might use WRITE_ONCE() and READ_ONCE().
> > If tearing/fusing/merging does not matter, as in there are not concurrent
> > accesses, it uses plain C-language loads and stores.
>
>  Fair enough.
>
> > >  NB MIPS has similar architectural arrangements (and a bunch of barriers
> > > defined in the ISA), it's just most implementations are actually strongly
> > > ordered, so most people can't see the effects of this.  With MIPS I know
> > > for sure there are cases of UP reordering, but they only really matter for
> > > MMIO use cases and not regular memory.
> >
> > Any given architecture is required to provide architecture-specific
> > implementations of the various functions that meet the requirements of
> > Linux-kernel memory model.  See tools/memory-model for more information.
>
>  This is a fairly recent addition, thank you for putting it all together.
> I used to rely solely on Documentation/memory-barriers.txt.  Thanks for
> the reference.

It has been in the kernel since April 2018, but OK.  And a big "thank you"
to all the people who made this possible and who continue contributing
to it.

And Documentation/memory-barriers.txt still matters, though the long-term
goal is for it to be subsumed into tools/memory-model.  Things like
compiler optimizations make this difficult, but not impossible.

Another precaution is to ensure that any constraints of a non-common-case
architecture be tested for.  For example, if I add a 64-bit divide,
I get yelled at promptly.  In contrast, the long list of byte accesses
that Arnd posted was suffered in silence.  So they accumulated well
past the point where they can reasonably be backed out.

							Thanx, Paul
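
For readers following along, here is a minimal userspace sketch of the
three levels of access described in the quoted text: ordered (release/
acquire), unordered but tear-free, and plain.  It uses C11 <stdatomic.h>
as a rough stand-in for the kernel wrappers; it is not kernel code, and
the mapping to smp_store_release(), smp_load_acquire(), WRITE_ONCE() and
READ_ONCE() is only approximate.  All names below (payload, ready, hint,
publish, consume, set_hint, get_hint) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example state; none of these names come from the kernel. */
static int payload;       /* plain data, handed over via 'ready' */
static atomic_int ready;  /* flag whose ordering matters */
static atomic_int hint;   /* value where ordering does not matter */

/* Writer side: roughly analogous to smp_store_release(). */
static void publish(int value)
{
	/* Plain store: no concurrent access to payload before publication. */
	payload = value;
	/* Release store: the payload store above cannot be reordered after it. */
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader side: roughly analogous to smp_load_acquire(). */
static bool consume(int *out)
{
	/* Acquire load: the payload load below cannot be reordered before it. */
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;	/* nothing published yet */
	*out = payload;		/* plain load, ordered after the flag check */
	return true;
}

/* No ordering required, only tear-free access: loosely analogous to
 * WRITE_ONCE()/READ_ONCE(), approximated here with relaxed atomics. */
static void set_hint(int v)
{
	atomic_store_explicit(&hint, v, memory_order_relaxed);
}

static int get_hint(void)
{
	return atomic_load_explicit(&hint, memory_order_relaxed);
}
```

In real use publish() and consume() would run on different threads; the
release/acquire pair is what guarantees that a reader seeing ready == 1
also sees the payload value stored before it.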