On Fri, Jul 16, 2021 at 12:19:29PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 16, 2021 at 05:02:33PM +0800, Hou Tao wrote:
>
> > > Cachelines don't guarantee anything, you can get partial forwards.
> >
> > Could you please point me to any reference ? I can not google
> > any memory order things by using "partial forwards".
>
> I'm not sure I have references, but there are CPUs that can do, for
> example, store forwarding at a granularity below cachelines (ie at
> register size).
>
> In such a case a CPU might observe the stored value before it is
> committed to memory.

There have been examples of systems with multiple hardware threads
per core, but where the hardware threads share a store buffer.  In
this case, the other threads in the core might see a store before it
is committed to a cache line.

As you might imagine, the hardware implementation of memory barriers
in such a system is tricky.  But not impossible.

							Thanx, Paul
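
The shared-store-buffer case described above can be sketched with a toy,
single-threaded model. This is only an illustration under stated
assumptions, not a description of any real CPU: the names (toy_system,
toy_store, toy_load, toy_commit), the one-entry buffer, and the single
variable x are all hypothetical. Each core has one store buffer shared by
all of its hardware threads, so a load issued on the same core is forwarded
the pending value, while a load on another core still sees the committed
cache-line value.

```c
#include <assert.h>
#include <stdbool.h>

#define NCORES 2	/* hypothetical: two cores, several hw threads each */

/* One pending store to the single variable x (toy: one-entry buffer). */
struct store_buffer {
	bool valid;	/* true if a store is waiting to be committed */
	int value;	/* the value that store will write */
};

struct toy_system {
	int cache_x;			/* committed, globally visible x */
	struct store_buffer sb[NCORES];	/* per core, shared by its threads */
};

/* A store by any hardware thread on @core lands in that core's buffer. */
void toy_store(struct toy_system *s, int core, int value)
{
	assert(core >= 0 && core < NCORES);
	s->sb[core].valid = true;
	s->sb[core].value = value;
}

/*
 * A load by any hardware thread on @core is forwarded from the core's
 * store buffer if a store is pending there; otherwise it reads the
 * committed value.
 */
int toy_load(const struct toy_system *s, int core)
{
	assert(core >= 0 && core < NCORES);
	if (s->sb[core].valid)
		return s->sb[core].value;
	return s->cache_x;
}

/* Drain @core's buffer: the store becomes visible to every core. */
void toy_commit(struct toy_system *s, int core)
{
	assert(core >= 0 && core < NCORES);
	if (s->sb[core].valid) {
		s->cache_x = s->sb[core].value;
		s->sb[core].valid = false;
	}
}
```

Starting from a zero-initialized struct toy_system, after toy_store(&s, 0, 1)
a sibling thread's toy_load(&s, 0) returns 1 while toy_load(&s, 1) on the
other core still returns 0. Only after toy_commit(&s, 0) do both cores
agree. Real hardware has to make memory barriers behave correctly in spite
of that window, which this model does not attempt to show.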