On Thu, Mar 04, 2021 at 02:33:32PM +0800, Boqun Feng wrote:
> Right, I was thinking about something unrelated.. but how about the
> following case:
>
>         local_v = &y;
>         r1 = READ_ONCE(*x); // f
>
>         if (r1 == 1) {
>                 local_v = &y; // e
>         } else {
>                 local_v = &z; // d
>         }
>
>         p = READ_ONCE(local_v); // g
>
>         r2 = READ_ONCE(*p); // h
>
> if r1 == 1, we definitely think we have:
>
>         f ->ctrl e ->rfi g ->addr h
>
> , and if we treat ctrl;rfi as "to-r", then we have "f" happens before
> "h". However the compiler can optimize the above as:
>
>         local_v = &y;
>
>         r1 = READ_ONCE(*x); // f
>
>         if (r1 != 1) {
>                 local_v = &z; // d
>         }
>
>         p = READ_ONCE(local_v); // g
>
>         r2 = READ_ONCE(*p); // h
>
> , and when this gets executed, I don't think we have the guarantee we
> have "f" happens before "h", because CPU can do optimistic read for "g"
> and "h".

In your example, which accesses are supposed to be to actual memory and
which to registers?  Also, remember that the memory model assumes the
hardware does not reorder loads if there is an address dependency
between them.

> Part of this is because when we take plain access into consideration, we
> won't guarantee a read-from or other relations exists if compiler
> optimization happens.
>
> Maybe I'm missing something subtle, but just try to think through the
> effect of making dep; rfi as "to-r".

Forget about local variables for the time being and just consider

        dep ; [Plain] ; rfi

For example:

        A: r1 = READ_ONCE(x);
           y = r1;
        B: r2 = READ_ONCE(y);

Should B be ordered after A?  I don't see how any CPU could hope to
execute B before A, but maybe I'm missing something.

There's another twist, connected with the fact that herd7 can't detect
control dependencies caused by unexecuted code.  If we have:

        A: r1 = READ_ONCE(x);
           if (r1)
                   WRITE_ONCE(y, 5);
           r2 = READ_ONCE(y);
        B: WRITE_ONCE(z, r2);

then in executions where x == 0, herd7 doesn't see any control dependency.
But CPUs do see control dependencies whenever there is a conditional
branch, whether the branch is taken or not, and so they will never
reorder B before A.

One last thing to think about: My original assessment of Björn's problem
wasn't right, because the dep in (dep ; rfi) doesn't include control
dependencies.  Only data and address.  So I believe that the LKMM
wouldn't consider A to be ordered before B in this example even if x
was nonzero.

Alan
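
For reference, the conditional-write example can be sketched as ordinary single-threaded C.  This is only an illustration under stated assumptions: the READ_ONCE()/WRITE_ONCE() macros below are simplified volatile-based stand-ins, not the kernel's definitions, and run_once() is a hypothetical helper name.  The sketch shows which statements execute for each value of x; it demonstrates nothing about ordering, which would need a litmus test and herd7.

```c
/*
 * Simplified stand-ins for the kernel macros (assumption: the real
 * kernel definitions are more elaborate than a plain volatile access).
 */
#define READ_ONCE(v)      (*(const volatile __typeof__(v) *)&(v))
#define WRITE_ONCE(v, n)  (*(volatile __typeof__(v) *)&(v) = (n))

static int x, y, z;

/*
 * Hypothetical helper: set up x and y, run the A..B sequence once,
 * and return the value that B stored to z.
 */
static int run_once(int x_init, int y_init)
{
	int r1, r2;

	x = x_init;
	y = y_init;
	z = 0;

	r1 = READ_ONCE(x);		/* A */
	if (r1)				/* the branch exists in every execution */
		WRITE_ONCE(y, 5);	/* but the store runs only when x != 0 */
	r2 = READ_ONCE(y);
	WRITE_ONCE(z, r2);		/* B */

	return z;
}
```

With y initially 7, run_once(0, 7) returns 7 (the store to y is unexecuted code, the case where herd7 sees no control dependency), and run_once(1, 7) returns 5.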