On Sat, May 11, 2024 at 10:08:50PM +0200, Arnd Bergmann wrote:
> On Sat, May 11, 2024, at 21:37, Paul E. McKenney wrote:
> > On Sat, May 11, 2024 at 08:49:08PM +0200, John Paul Adrian Glaubitz wrote:
> >
> > The pre-EV56 Alphas have no byte store instruction, correct?
> >
> > If that is in fact correct, what code is generated for a volatile store
> > to a single byte for those CPUs?  For example, for this example?
> >
> >     char c;
> >
> >     ...
> >
> >     WRITE_ONCE(c, 3);
> >
> > The rumor I heard is that the compilers will generate a non-atomic
> > read-modify-write instruction sequence in this case, first reading the
> > 32-bit word containing that byte into a register, then substituting the
> > value to be stored into corresponding byte of that register, and finally
> > doing a 32-bit store from that register.
> >
> > Is that the case, or am I confused?
>
> I think it's slightly worse: gcc will actually do a 64-bit
> read-modify-write rather than a 32-bit one, and it doesn't
> use atomic ll/sc when storing into an _Atomic struct member:
>
> echo '#include <stdatomic.h>^M struct s { _Atomic char c; _Atomic char s[7]; long l; }; void f(struct s *s) { atomic_store(&s->c, 3); }' | alpha-linux-gcc-14 -xc - -S -o- -O2 -mcpu=ev5
>
> f:
>         .frame $30,0,$26,0
> $LFB0:
>         .cfi_startproc
>         .prologue 0
>         mb
>         lda $1,3($31)
>         insbl $1,$16,$1
>         ldq_u $2,0($16)
>         mskbl $2,$16,$2
>         bis $1,$2,$1
>         stq_u $1,0($16)
>         bis $31,$31,$31
>         mb
>         ret $31,($26),1
>         .cfi_endproc
> $LFE0:
>         .end f
>
> compared to -mcpu=ev56:
>
> f:
>         .frame $30,0,$26,0
> $LFB0:
>         .cfi_startproc
>         .prologue 0
>         mb
>         lda $1,3($31)
>         stb $1,0($16)
>         bis $31,$31,$31
>         mb
>         ret $31,($26),1
>         .cfi_endproc
> $LFE0:
>         .end f

Thank you, Arnd!

And that breaks things because it can clobber concurrent stores to
other bytes in that enclosing machine word.

                                                        Thanx, Paul
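
For concreteness, here is a minimal single-threaded sketch of how such a
clobber could happen.  It is only a model, not the kernel's or gcc's code:
the helper names (rmw_load, rmw_store, lost_update_demo) are hypothetical,
and the "other CPU" is simulated by an ordinary store placed between the
two halves of the 64-bit read-modify-write.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the ldq_u step: fetch all 8 bytes of the
 * enclosing 64-bit word into a register-like snapshot. */
uint64_t rmw_load(const unsigned char *word)
{
    uint64_t q;
    memcpy(&q, word, sizeof q);
    return q;
}

/* Hypothetical model of mskbl/insbl/bis/stq_u: replace one byte in the
 * snapshot, then write all 8 bytes back, including the 7 bytes that
 * the caller never meant to touch. */
void rmw_store(unsigned char *word, uint64_t q, unsigned idx,
               unsigned char val)
{
    unsigned char tmp[8];
    memcpy(tmp, &q, sizeof tmp);
    tmp[idx] = val;
    memcpy(word, tmp, sizeof tmp);
}

/* Simulated interleaving: "CPU 0" stores 3 to byte 0 while "CPU 1"
 * stores 7 to byte 1 between CPU 0's load and store. */
void lost_update_demo(unsigned char word[8])
{
    memset(word, 0, 8);
    uint64_t snapshot = rmw_load(word);  /* CPU 0: load the whole word */
    word[1] = 7;                         /* CPU 1: store to neighbor byte */
    rmw_store(word, snapshot, 0, 3);     /* CPU 0: writes stale byte 1 back */
}
```

In this model, word[0] ends up as 3 but word[1] is back to 0: the store of
7 is lost even though the two stores targeted different bytes.  With a real
byte store (stb on EV56 and later), only byte 0 would be written.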