On 9/9/2021 9:35 AM, Will Deacon wrote:
> [+Palmer, PaulW, Daniel and Michael]
>
> On Thu, Sep 09, 2021 at 09:25:30AM +0200, Peter Zijlstra wrote:
>> On Wed, Sep 08, 2021 at 09:08:33AM -0700, Linus Torvalds wrote:
>>
>>> So if this is purely a RISC-V thing,
>>
>> Just to clarify, I think the current RISC-V thing is stronger than
>> PowerPC, but maybe not as strong as say ARM64, but RISC-V memory
>> ordering is still somewhat hazy to me.
>>
>> Specifically, the sequence:
>>
>>   /* critical section s */
>>   WRITE_ONCE(x, 1);
>>   FENCE RW, W
>>   WRITE_ONCE(s.lock, 0);    /* store S */
>>   AMOSWAP %0, 1, r.lock     /* store R */
>>   FENCE R, RW
>>   WRITE_ONCE(y, 1);
>>   /* critical section r */
>>
>> fully separates section s from section r, as in RW->RW ordering
>> (possibly not as strong as smp_mb() though), while on PowerPC it would
>> only impose TSO ordering between sections.
>>
>> The AMOSWAP is a RmW and as such matches the W from the RW->W fence,
>> similarly it matches the R from the R->RW fence, yielding an:
>>
>>   RW->  W
>>         RmW
>>         R  ->RW
>>
>> ordering. It's the stores S and R that can be re-ordered, but not the
>> sections themselves (same on PowerPC and many others).
>>
>> Clarification from a RISC-V enabled person would be appreciated.

To first order, RISC-V's memory model is very similar to ARMv8's. It
is "other-multi-copy-atomic", unlike Power, and respects dependencies.
It also has AMOs and LR/SC with optional RCsc acquire or release
semantics. There's no need to worry about RISC-V somehow pushing the
boundaries of weak memory ordering in new ways.

The tricky part is that unlike ARMv8, RISC-V doesn't have load-acquire
or store-release opcodes at all. Only AMOs and LR/SC have acquire or
release options. That means that while certain operations like swap
can be implemented with native RCsc semantics, others like store-release
have to fall back on fences and plain writes.
That's where the complexity came up last time this was discussed, at
least as it relates to RISC-V: how to make sure the combination of RCsc
atomics and plain operations+fences gives the semantics everyone is
asking for here. And to be clear there, I'm not asking for LKMM to
weaken anything about critical section ordering just for RISC-V's sake.
TSO/RCsc ordering between critical sections is a perfectly reasonable
model in my opinion. I just want to make sure RISC-V gets it right
given whatever the decision is.

>>> then I think it's entirely reasonable to
>>>
>>>         spin_unlock(&r);
>>>         spin_lock(&s);
>>>
>>> cannot be reordered.
>>
>> I'm obviously completely in favour of that :-)
>
> I don't think we should require the accesses to the actual lockwords to
> be ordered here, as it becomes pretty onerous for relaxed LL/SC
> architectures where you'd end up with an extra barrier either after the
> unlock() or before the lock() operation. However, I remain absolutely in
> favour of strengthening the ordering of the _critical sections_ guarded by
> the locks to be RCsc.

I agree with Will here. If the AMOSWAP above is actually implemented with
a RISC-V AMO, then the two critical sections will be separated as if
RW,RW, as Peter described. If instead it's implemented using LR/SC, then
RISC-V gives only TSO (R->R, R->W, W->W), because the two pieces of the
AMO are split, and that breaks the chain. Getting full RW->RW between
the critical sections would therefore require an extra fence. Also, the
accesses to the lockwords themselves would not be ordered without an
extra fence.

> Last time this came up, I think the RISC-V folks were generally happy to
> implement whatever was necessary for Linux [1]. The thing that was stopping
> us was Power (see CONFIG_ARCH_WEAK_RELEASE_ACQUIRE), wasn't it? I think
> Michael saw quite a bit of variety in the impact on benchmarks [2] across
> different machines.
> So the question is whether newer Power machines are less
> affected to the degree that we could consider making this change again.

Yes, as I said above, RISC-V will implement what is needed to make this work.

Dan

> Will
>
> [1] https://lore.kernel.org/lkml/11b27d32-4a8a-3f84-0f25-723095ef1076@xxxxxxxxxx/
> [2] https://lore.kernel.org/lkml/87tvp3xonl.fsf@xxxxxxxxxxxxxxxxxxxxxxxx/
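
For anyone following along, the shape of the unlock(s)/lock(r) sequence
discussed in this thread can be sketched in portable C11 atomics. This is
only an illustration, not the kernel's spinlock code: struct toy_lock,
toy_unlock(), toy_lock_acquire() and unlock_then_lock() are hypothetical
names, and the notes on how a RISC-V compiler would lower each operation
describe the usual mapping as I understand it, not a guarantee. The C11
model is also not LKMM, so this does not by itself settle what ordering
between the two critical sections is guaranteed.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical toy lock for illustration: 0 = free, 1 = held. */
struct toy_lock {
    atomic_int locked;
};

/* Stand-ins for the x and y written inside the two critical sections. */
static atomic_int x, y;

/*
 * Store-release on the lockword ("store S").  RISC-V has no
 * store-release opcode, so this is typically lowered to a fence
 * followed by a plain store (assumption: "fence rw,w" + "sw").
 */
static void toy_unlock(struct toy_lock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/*
 * Swap with acquire semantics ("store R").  With the A extension this
 * can typically be a single AMO (assumption: "amoswap.w.aq"); an LR/SC
 * loop would split the read and the write into two instructions.
 */
static void toy_lock_acquire(struct toy_lock *l)
{
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) != 0)
        ;   /* spin until the previous value was 0 (lock was free) */
}

/* Critical section s, unlock s, lock r, critical section r. */
static int unlock_then_lock(struct toy_lock *s, struct toy_lock *r)
{
    atomic_store_explicit(&x, 1, memory_order_relaxed);  /* section s */
    toy_unlock(s);
    toy_lock_acquire(r);
    atomic_store_explicit(&y, 1, memory_order_relaxed);  /* section r */
    return atomic_load_explicit(&x, memory_order_relaxed)
         + atomic_load_explicit(&y, memory_order_relaxed);
}
```

Called single-threaded with s held and r free, unlock_then_lock() leaves
s free and r held. The interesting question in this thread is what other
CPUs may observe about x and y, which a single-threaded run cannot show.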