On Fri, Nov 24, 2023 at 11:15:19AM +0100, Peter Zijlstra wrote:
> On Fri, Nov 24, 2023 at 08:21:37AM +0100, Christoph Muellner wrote:
> > From: Christoph Müllner <christoph.muellner@xxxxxxxx>
> >
> > The upcoming RISC-V Ssdtso specification introduces a bit in the senvcfg
> > CSR to switch the memory consistency model at run-time from RVWMO to TSO
> > (and back). The active consistency model can therefore be switched on a
> > per-hart base and managed by the kernel on a per-process/thread base.
>
> You guys, computers are hartless, nobody told ya?
>
> > This patch implements basic Ssdtso support and adds a prctl API on top
> > so that user-space processes can switch to a stronger memory consistency
> > model (than the kernel was written for) at run-time.
> >
> > I am not sure if other architectures support switching the memory
> > consistency model at run-time, but designing the prctl API in an
> > arch-independent way allows reusing it in the future.
>
> IIRC some Sparc chips could do this, but I don't think anybody ever
> exposed this to userspace (or used it much).
>
> IA64 had planned to do this, except they messed it up and did it the
> wrong way around (strong first and then relax it later), which lead to
> the discovery that all existing software broke (d'uh).
>
> I think ARM64 approached this problem by adding the
> load-acquire/store-release instructions and for TSO based code,
> translate into those (eg. x86 -> arm64 transpilers).

Keeping a global TSO order is easier and faster than mixing
acquire/release with regular loads and stores. With Ssdtso enabled, the
transpiler can emit regular loads and stores where it would otherwise
emit load-acquire/store-release, and some microarchitectures can run
that faster.

Of course, you may say powerful machines could smooth out the
difference between Ssdtso and load-acquire/store-release, but that's
not real life. Adding Ssdtso is a flexible way to give chip designers
more choices when trading off design cost.
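
To make the transpiler point concrete, here is a minimal sketch of the
instruction selection I have in mind. It is not taken from the patch or
from any real translator: the enum names and select_host_op() are
hypothetical, and the host operations are abstract kinds rather than
real mnemonics. It assumes acquire/release on every access is what the
translator uses to preserve guest TSO order on a weakly ordered host.

```c
#include <stdbool.h>

/* Memory accesses of a TSO guest (e.g. x86) as seen by a hypothetical
 * binary translator. */
enum guest_op { GUEST_LOAD, GUEST_STORE };

/* Abstract host operations the translator can emit. These are kinds of
 * instruction, not real mnemonics. */
enum host_op {
    HOST_LOAD_PLAIN,
    HOST_LOAD_ACQUIRE,
    HOST_STORE_PLAIN,
    HOST_STORE_RELEASE,
};

/*
 * Choose the host operation for one guest access.
 *
 * host_is_tso: true when the hart already runs with TSO ordering (the
 * mode Ssdtso would switch on), false for the weak model (RVWMO).
 */
enum host_op select_host_op(enum guest_op op, bool host_is_tso)
{
    if (host_is_tso) {
        /* Hardware keeps TSO order, so plain accesses are enough. */
        return op == GUEST_LOAD ? HOST_LOAD_PLAIN : HOST_STORE_PLAIN;
    }
    /* Weak host: every guest access needs an ordering annotation. */
    return op == GUEST_LOAD ? HOST_LOAD_ACQUIRE : HOST_STORE_RELEASE;
}
```

So select_host_op(GUEST_STORE, false) picks a store-release, while the
same guest store with host_is_tso set picks a plain store. Whether the
plain form is actually faster depends on the microarchitecture.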
>
> IIRC Risc-V actually has such instructions as well, so *why* are you
> doing this?!?!
>