On Thu, Oct 26, 2023 at 07:21:53AM -0700, Boqun Feng wrote:
> On Thu, Oct 26, 2023 at 01:16:25PM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 26, 2023 at 11:36:10AM +0100, Gary Guo wrote:
> >
> > > There are two reasons that we are using volatile read/write as opposed to
> > > relaxed atomic:
> > > * Rust lacks volatile atomics at the moment. Non-volatile atomics are
> > >   not sufficient because the compiler is allowed to optimise atomics
> > >   (although compilers currently don't). If you have two adjacent relaxed
> > >   loads, they could be merged into one.
> >
> > Ah yes, that would be problematic, e.g., if lifted out of a loop things
> > could go sideways fast.
> >
>
> Maybe we can work around this limitation by using compiler barriers, i.e.
>
>     compiler_fence(SeqCst);
>     load(Relaxed);
>     compiler_fence(Acquire);
>
> this is slightly stronger than a volatile atomic.
>
> > > * Atomics only work for integer types determined by the platform. On
> > >   some 32-bit platforms you wouldn't be able to use 64-bit atomics at
> > >   all, and on x86 you get a less optimal sequence since a volatile load is
> > >   permitted to tear while an atomic load needs to use LOCK CMPXCHG8B.
> >
> > We only grudgingly allowed u64 READ_ONCE() on 32bit platforms because
> > the fallout was too numerous to fix. Some of them are probably bugs.
> >
> > Also, I think cmpxchg8b without lock prefix would be sufficient, but
> > I've got too much of a headache to be sure. Worse is that we still
> > support targets without cmpxchg8b.
> >
> > It might be interesting to make the Rust side more strict in this regard
> > and see where/when we run into trouble.
> >
>
> Sounds good to me. If the compiler barriers make sense for now, then
> we can do:
>
>     pub unsafe fn read_once_usize(ptr: *const usize) -> usize {
>         core::sync::atomic::compiler_fence(SeqCst);
>         let r = unsafe { &*ptr.cast::<AtomicUsize>() }.load(Relaxed);
>         core::sync::atomic::compiler_fence(Acquire);
>         r
>     }
>

I forgot to mention, this can also resolve the comments from Marco,
i.e. switching the implementation to Acquire if ARM64 & LTO.

Regards,
Boqun

> and if the other side (i.e. write) is also atomic (e.g. WRITE_ONCE()),
> we don't have a data race.
>
> However, there are still cases where data races are ignored in C code,
> for example inode::i_state: reads outside of locks race with writes inside
> locks, since writes are done by plain accesses. Nothing can be done to
> fix that from the Rust side only, and fixing the C side is a separate topic.
>
> Thoughts?
>
> Regards,
> Boqun
>
> > > * Atomics don't work for complex structs. Although I am not quite sure
> > >   of the value of supporting it.
> >
> > So on the C side we mandate the size is no larger than machine word,
> > with the exception of the u64 on 32bit thing. We don't mandate strict
> > integer types because things like pte_t are wrapper types.
> >
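
For reference, here is a self-contained sketch of the read_once_usize() idea quoted above. It assumes plain `core` atomics and is not an existing kernel helper. It takes a shared reference (`&*`) to the `AtomicUsize`, because an atomic cannot be moved out from behind a raw pointer. The safety contract in the doc comment is my assumption and was not settled in this thread.

```rust
use core::sync::atomic::{compiler_fence, AtomicUsize, Ordering};

/// Reads `*ptr` exactly once, in the spirit of the C side's READ_ONCE().
///
/// # Safety (assumed contract, for illustration only)
///
/// `ptr` must be non-null, aligned for `usize`, and valid for reads, and
/// any concurrent writer must also use atomic accesses.
pub unsafe fn read_once_usize(ptr: *const usize) -> usize {
    // Keep the compiler from moving earlier memory accesses past the load.
    compiler_fence(Ordering::SeqCst);
    // SAFETY: `AtomicUsize` has the same size, alignment, and bit validity
    // as `usize`; the caller guarantees the pointer is valid for reads.
    let atomic = unsafe { &*ptr.cast::<AtomicUsize>() };
    let r = atomic.load(Ordering::Relaxed);
    // Keep the compiler from hoisting later memory accesses above the load.
    compiler_fence(Ordering::Acquire);
    r
}
```

A caller would use it as `unsafe { read_once_usize(&x) }` for some `x: usize`. The two compiler_fence() calls only constrain the compiler, so on weakly ordered CPUs the load itself still has Relaxed semantics.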