On Thu, 26 Oct 2023 10:13:45 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Wed, Oct 25, 2023 at 12:53:39PM -0700, Boqun Feng wrote:
> > In theory, `read_volatile` and `write_volatile` in Rust can have UB in
> > case of the data races [1]. However, kernel uses volatiles to implement
> > READ_ONCE() and WRITE_ONCE(), and expects races on these marked accesses
> > don't cause UB. And they are proven to have a lot of usages in kernel.
> >
> > To close this gap, `read_once` and `write_once` are introduced, they
> > have the same semantics as `READ_ONCE` and `WRITE_ONCE` especially
> > regarding data races under the assumption that `read_volatile` and
> > `write_volatile` have the same behavior as a volatile pointer in C from
> > a compiler point of view.
> >
> > Longer term solution is to work with Rust language side for a better way
> > to implement `read_once` and `write_once`. But so far, it should be good
> > enough.
>
> So the whole READ_ONCE()/WRITE_ONCE() thing does two things we care
> about (AFAIR):
>
>  - single-copy-atomicy; this can also be achieved using the C11
>    __atomic_load_n(.memorder=__ATOMIC_RELAXED) /
>    __atomic_store_n(.memorder=__ATOMIC_RELAXED) thingies.
>
>  - the ONCE thing; that is inhibits re-materialization, and here I'm not
>    sure C11 atomics help, they might since re-reading an atomic is
>    definitely dodgy -- after all it could've changed.
>
> Now, traditionally we've relied on the whole volatile thing simply
> because there was no C11, or our oldest compiler didn't do C11. But
> these days we actually *could*.
>
> Now, obviously C11 has issues vs LKMM, but perhaps the load/store
> semantics are near enough to be useful. (IIRC this also came up in the
> *very* long x86/percpu thread)
>
> So is there any distinction between the volatile load/store and the C11
> atomic load/store that we care about and could not Rust use the atomic
> load/store to avoid their UB ?

There are a few reasons that we are using volatile read/write as opposed
to relaxed atomic:

* Rust lacks volatile atomics at the moment. Non-volatile atomics are not
  sufficient because the compiler is allowed to optimise atomics
  (although compilers currently don't). If you have two adjacent relaxed
  loads, they could be merged into one.
* Atomics only work for integer types determined by the platform. On
  some 32-bit platforms you wouldn't be able to use 64-bit atomics at
  all, and on x86 you get a less optimal sequence, since a volatile load
  is permitted to tear while an atomic load needs to use LOCK CMPXCHG8B.
* Atomics don't work for complex structs, although I am not quite sure
  of the value of supporting that.
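
To make the points above a bit more concrete, here is a rough sketch in
plain Rust (std only, not the actual kernel patch). The names `read_once`
and `write_once` come from the patch under discussion, but the bodies
below are only my assumption of their simplest possible shape;
`relaxed_twice`, `volatile_twice` and `Pair` are hypothetical names made
up for illustration.

```rust
use std::ptr;
use std::sync::atomic::{AtomicU32, Ordering};

/// Assumed shape of `read_once`: a plain volatile load of any `Copy` type.
///
/// # Safety
/// `src` must be non-null, properly aligned and valid for reads.
pub unsafe fn read_once<T: Copy>(src: *const T) -> T {
    // Each volatile access must be emitted; it cannot be merged or elided.
    unsafe { ptr::read_volatile(src) }
}

/// Assumed shape of `write_once`: a plain volatile store of any `Copy` type.
///
/// # Safety
/// `dst` must be non-null, properly aligned and valid for writes.
pub unsafe fn write_once<T: Copy>(dst: *mut T, val: T) {
    unsafe { ptr::write_volatile(dst, val) }
}

/// Two adjacent relaxed loads. Nothing in the language forces two machine
/// loads here: a compiler would be allowed to merge them into one, even
/// though current compilers do not.
pub fn relaxed_twice(x: &AtomicU32) -> (u32, u32) {
    (x.load(Ordering::Relaxed), x.load(Ordering::Relaxed))
}

/// Two adjacent volatile loads. Both accesses have to be performed.
pub fn volatile_twice(x: &u32) -> (u32, u32) {
    // SAFETY: a shared reference is always aligned and valid for reads.
    unsafe { (ptr::read_volatile(x), ptr::read_volatile(x)) }
}

/// Hypothetical struct with no atomic counterpart in the standard library.
/// A volatile access still compiles for it, but the access may tear.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Pair {
    pub a: u64,
    pub b: u64,
}
```

Note that the volatile versions accept any `Copy` type, including `Pair`,
whereas the atomic version is tied to the fixed set of `Atomic*` integer
types that the target provides.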