Re: how to understand cpu in-order commit

"laokz" <laokz@xxxxxxxxxxx> · Sat, 30 May 2020 18:36:37 +0800

Hello Paul,

This is a somewhat longer story; I am still searching and still lost in the fog :-)
I hope you can shed some light on it. Thanks!

I commented out smp_mb() in
tools/memory-model/litmus-tests/LB+fencembonceonce+ctrlonceonce.litmus:

P0(int *x, int *y)
{
	int r0;

	r0 = READ_ONCE(*x);
	if (r0)
		WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r0;

	r0 = READ_ONCE(*y);
//	smp_mb();
	WRITE_ONCE(*x, 1);
}

I am confused because LKMM reports that the outcome P0:r0=1 /\ P1:r0=1 can occur.

I would like to clear up these questions:

1. Is there any CPU with out-of-order commit/retirement among the
architectures Linux supports? If so, which one? In that case the rest is trivial.

2. READ_ONCE() and WRITE_ONCE() ensure that the compiler respects program order.
If P0:r0=1, then P0 must have observed P1's write to x, so that write came
before P0's read in wall-clock time. If P1's write to x has happened (committed,
and therefore visible to other CPUs), then P1's read from y must have happened
even earlier, because of the CPU's in-order commit/retirement restriction.
So how can P1's read, the earliest of these events, return the value 1?

Thanks again,
laokz

On Tue, 2019-11-19 at 13:11 -0800, Paul E. McKenney wrote:
> On Tue, Nov 19, 2019 at 11:45:03PM +0800, laokz wrote:
> > Hello Paul,
> > 
> > On Tue, 2019-11-19 at 06:44 -0800, Paul E. McKenney wrote:
> > > On Tue, Nov 19, 2019 at 10:11:48PM +0800, laokz wrote:
> > > > But I wonder about CPU in-order commit. So I removed smp_mb(), and
> > > > the litmus test showed that the result exists! This confused me. In
> > > > my understanding, CPU in-order commit means that out-of-order
> > > > execution results are committed to registers/memory in compiled
> > > > program order. That is, CPU P1 must retire r0=y first and then x=1,
> > > > and only then can P0 see P1's update of x. So P1's r0 should never
> > > > be 1.
> > > > 
> > > > Is this caused by LKMM's compatibility with out-of-order commit
> > > > architectures? Or what's wrong with me?
> > > 
> > > Nothing is wrong with you.  You are just going through a common phase
> > > in learning about memory models.  ;-)
> > > 
> > > So you modified P1() as follows, correct?
> > > 
> > > 	P1(int *x, int *y)
> > > 	{
> > > 		int r0;
> > > 
> > > 		r0 = READ_ONCE(*y);
> > > 		WRITE_ONCE(*x, 1);
> > > 	}
> > > 
> > > The compiler is free to rearrange this code as follows:
> > > 
> > > 	P1(int *x, int *y)
> > > 	{
> > > 		int r0;
> > > 
> > > 		WRITE_ONCE(*x, 1);
> > > 		r0 = READ_ONCE(*y);
> > > 	}
> > > 
> > > This can clearly satisfy the exists clause:  P1() does its write,
> > > P0() does its read and its write, and finally P1() does its read.
> > 
> > The compiler is one big deal. It seems that few compilers obey the C11
> > standard strictly. "Accesses to volatile objects are evaluated strictly
> > according to the rules of the abstract machine." at least means they
> > should not reorder volatile accesses across sequence points.
> 
> Yes, you are right, the volatile nature of READ_ONCE() and WRITE_ONCE()
> would prohibit the reordering above.  On the other hand, the C++11
> standard really does allow relaxed atomic loads and stores to be
> reordered.  And since I was at a C++ standards committee meeting a few
> weeks ago, I had relaxed atomics on my brain.  Apologies for my confusion.
> 
> > > But suppose we prevented the compiler from moving the code:
> > > 
> > > 	P1(int *x, int *y)
> > > 	{
> > > 		int r0;
> > > 
> > > 		r0 = READ_ONCE(*y);
> > > 		barrier();
> > > 		WRITE_ONCE(*x, 1);
> > > 	}
> > > 
> > > Then, as you say, weakly ordered CPUs might still reorder P1()'s
> > > read and write.  So LKMM must still say that the exists clause is
> > > satisfied.
> > 
> > Legacy resource is another big deal.
> 
> I will let you argue the "Legacy resource" point with the vendors still
> selling weakly ordered CPUs.  ;-)
> 
> > Thanks for your quick reply. It really clears my head.
> 
> ;-) ;-) ;-)
> 
> 							Thanx, Paul
>