On Thu, May 02, 2024 at 11:07:57PM +0100, Al Viro wrote:
> On Thu, May 02, 2024 at 02:18:48PM -0700, Paul E. McKenney wrote:
>
> > If you are only ever doing atomic read-modify-write operations on the
> > byte in question, then agreed, you don't care about byte loads and stores.
> >
> > But there are use cases that do mix smp_store_release() with cmpxchg(),
> > and those use cases won't work unless at least byte store is implemented.
> > Or I suppose that we could use cmpxchg() instead of smp_store_release(),
> > but that is wasteful for architectures that do support byte stores.
> >
> > So EV56 adds the byte loads and stores needed for those use cases.
> >
> > Or am I missing your point?
>
> arch/alpha/include/cmpxchg.h:
> #define arch_cmpxchg(ptr, o, n) \
> ({ \
>	__typeof__(*(ptr)) __ret; \
>	__typeof__(*(ptr)) _o_ = (o); \
>	__typeof__(*(ptr)) _n_ = (n); \
>	smp_mb(); \
>	__ret = (__typeof__(*(ptr))) __cmpxchg((ptr), \
>		(unsigned long)_o_, (unsigned long)_n_, sizeof(*(ptr)));\
>	smp_mb(); \
>	__ret; \
> })
>
> Are those smp_mb() in there enough?
>
> I'm probably missing your point, though - what mix of cmpxchg and
> smp_store_release on 8bit values?

One of RCU's state machines uses smp_store_release() to start the
state machine (only one task gets to do this) and cmpxchg() to update
state beyond that point.  And the state is 8 bits so that it and other
state fits into 32 bits to allow a single check for multiple conditions
elsewhere.

							Thanx, Paul
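
For illustration only, here is a rough userspace sketch of the pattern
described above: a release store starts the state machine, a
compare-and-swap advances it, and one 32-bit load checks all the packed
fields at once.  This is not the RCU code.  The union layout, the field
names, the state values, and the helper names are all hypothetical, and
the GCC/Clang __atomic builtins stand in for smp_store_release() and
cmpxchg().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: four 8-bit fields overlaid on one 32-bit word. */
union packed_state {
	struct {
		uint8_t state;	/* the 8-bit state machine */
		uint8_t flag_a;	/* other, unrelated conditions */
		uint8_t flag_b;
		uint8_t flag_c;
	} b;
	uint32_t s;		/* all four fields viewed as one word */
};

/* Hypothetical state values. */
enum { ST_IDLE = 0, ST_STARTED = 1, ST_DONE = 2 };

/*
 * Start the machine.  Only one task ever does this, so a byte-sized
 * release store is enough; no read-modify-write is needed.
 */
static void start_machine(union packed_state *p)
{
	__atomic_store_n(&p->b.state, ST_STARTED, __ATOMIC_RELEASE);
}

/*
 * Later transitions can race with each other, so they use a byte-sized
 * compare-and-swap.  Returns true if the transition was applied.
 */
static bool advance_machine(union packed_state *p, uint8_t old, uint8_t next)
{
	return __atomic_compare_exchange_n(&p->b.state, &old, next, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
}

/* One load covers every field: true if any of them is nonzero. */
static bool anything_pending(union packed_state *p)
{
	return __atomic_load_n(&p->s, __ATOMIC_RELAXED) != 0;
}

/* Walk through the sequence once; returns 0 if every step behaves. */
int demo(void)
{
	union packed_state p = { .s = 0 };

	assert(!anything_pending(&p));
	start_machine(&p);
	assert(anything_pending(&p));
	assert(advance_machine(&p, ST_STARTED, ST_DONE));
	assert(!advance_machine(&p, ST_STARTED, ST_DONE)); /* already moved on */
	return 0;
}
```

The point of the sketch is that the starting store and the later
compare-and-swap hit the same byte.  On hardware without a byte store,
the store would have to be emulated with a read-modify-write on the
containing word, which is the cost mentioned in the quoted text.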