Re: XDP socket rings, and LKMM litmus tests

Boqun Feng <boqun.feng@xxxxxxxxx> · Fri, 5 Mar 2021 09:12:30 +0800

On Thu, Mar 04, 2021 at 11:11:42AM -0500, Alan Stern wrote:
> On Thu, Mar 04, 2021 at 02:33:32PM +0800, Boqun Feng wrote:
> 
> > Right, I was thinking about something unrelated.. but how about the
> > following case:
> > 
> > 	local_v = &y;
> > 	r1 = READ_ONCE(*x); // f
> > 
> > 	if (r1 == 1) {
> > 		local_v = &y; // e
> > 	} else {
> > 		local_v = &z; // d
> > 	}
> > 
> > 	p = READ_ONCE(local_v); // g
> > 
> > 	r2 = READ_ONCE(*p);   // h
> > 
> > if r1 == 1, we definitely think we have:
> > 
> > 	f ->ctrl e ->rfi g ->addr h
> > 
> > , and if we treat ctrl;rfi as "to-r", then we have "f" happens before
> > "h". However compile can optimze the above as:
> > 
> > 	local_v = &y;
> > 
> > 	r1 = READ_ONCE(*x); // f
> > 
> > 	if (r1 != 1) {
> > 		local_v = &z; // d
> > 	}
> > 
> > 	p = READ_ONCE(local_v); // g
> > 
> > 	r2 = READ_ONCE(*p);   // h
> > 
> > , and when this gets executed, I don't think we have the guarantee we
> > have "f" happens before "h", because CPU can do optimistic read for "g"
> > and "h".
> 
> In your example, which accesses are supposed to be to actual memory and 
> which to registers?  Also, remember that the memory model assumes the 

Given that we use READ_ONCE() on local_v, local_v should be a memory
location but only accessed by this thread.

> hardware does not reorder loads if there is an address dependency 
> between them.
> 

Right, so "g" won't be reordered after "h".

> > Part of this is because when we take plain access into consideration, we
> > won't guarantee a read-from or other relations exists if compiler
> > optimization happens.
> > 
> > Maybe I'm missing something subtle, but just try to think through the
> > effect of making dep; rfi as "to-r".
> 
> Forget about local variables for the time being and just consider
> 
> 	dep ; [Plain] ; rfi
> 
> For example:
> 
> 	A: r1 = READ_ONCE(x);
> 	   y = r1;
> 	B: r2 = READ_ONCE(y);
> 
> Should B be ordered after A?  I don't see how any CPU could hope to 
> excute B before A, but maybe I'm missing something.
> 

Agreed.

> There's another twist, connected with the fact that herd7 can't detect 
> control dependencies caused by unexecuted code.  If we have:
> 
> 	A: r1 = READ_ONCE(x);
> 	if (r1)
> 		WRITE_ONCE(y, 5);
> 	r2 = READ_ONCE(y);
> 	B: WRITE_ONCE(z, r2);
> 
> then in executions where x == 0, herd7 doesn't see any control 
> dependency.  But CPUs do see control dependencies whenever there is a 
> conditional branch, whether the branch is taken or not, and so they will 
> never reorder B before A.
> 

Right, because B in this example is a write, what if B is a read that
depends on r2, like in my example? Let y be a pointer to a memory
location, and initialized as a valid value (pointing to a valid memory
location) you example changed to:

	A: r1 = READ_ONCE(x);
	if (r1)
		WRITE_ONCE(y, 5);
	C: r2 = READ_ONCE(y);
	B: r3 = READ_ONCE(*r2);

, then A don't have the control dependency to B, because A and B is
read+read. So B can be ordered before A, right?

> One last thing to think about: My original assessment or Björn's problem 
> wasn't right, because the dep in (dep ; rfi) doesn't include control 
> dependencies.  Only data and address.  So I believe that the LKMM 

Ah, right. I was mising that part (ctrl is not in dep). So I guess my
example is pointless for the question we are discussing here ;-(

> wouldn't consider A to be ordered before B in this example even if x 
> was nonzero.

Yes, and similar to my example (changing B to a read).

I did try to run my example with herd, and got confused no matter I make
dep; [Plain]; rfi as to-r (I got the same result telling me a reorder
can happen). Now the reason is clear, because this is a ctrl; rfi not a
dep; rfi.

Thanks so much for walking with me on this ;-)

Regards,
Boqun

> 
> Alan