On Sat, Feb 22, 2014 at 07:30:37PM +0100, Torvald Riegel wrote:
> On Thu, 2014-02-20 at 10:18 -0800, Paul E. McKenney wrote:
> > On Thu, Feb 20, 2014 at 06:26:08PM +0100, Torvald Riegel wrote:
> > > On Wed, 2014-02-19 at 20:01 -0800, Paul E. McKenney wrote:
> > > > On Wed, Feb 19, 2014 at 04:53:49PM -0800, Linus Torvalds wrote:
> > > > > On Tue, Feb 18, 2014 at 11:47 AM, Torvald Riegel <triegel@xxxxxxxxxx> wrote:
> > > > > > On Tue, 2014-02-18 at 09:44 -0800, Linus Torvalds wrote:
> > > > > >>
> > > > > >> Can you point to it? Because I can find a draft standard, and it sure
> > > > > >> as hell does *not* contain any clarity of the model. It has a *lot* of
> > > > > >> verbiage, but it's pretty much impossible to actually understand, even
> > > > > >> for somebody who really understands memory ordering.
> > > > > >
> > > > > > http://www.cl.cam.ac.uk/~mjb220/n3132.pdf
> > > > > > This has an explanation of the model up front, and then the detailed
> > > > > > formulae in Section 6. This is from 2010, and there might have been
> > > > > > smaller changes since then, but I'm not aware of any bigger ones.
> > > > >
> > > > > Ahh, this is different from what others pointed at. Same people,
> > > > > similar name, but not the same paper.
> > > > >
> > > > > I will read this version too, but from reading the other one and the
> > > > > standard in parallel and trying to make sense of it, it seems that I
> > > > > may have originally misunderstood part of the whole control dependency
> > > > > chain.
> > > > >
> > > > > The fact that the left side of "? :", "&&" and "||" breaks data
> > > > > dependencies made me originally think that the standard tried very
> > > > > hard to break any control dependencies.
> > > > > Which I felt was insane, when
> > > > > then some of the examples literally were about the testing of the
> > > > > value of an atomic read. The data dependency matters quite a bit. The
> > > > > fact that the other "Mathematical" paper then very much talked about
> > > > > consume only in the sense of following a pointer made me think so even
> > > > > more.
> > > > >
> > > > > But reading it some more, I now think that the whole "data dependency"
> > > > > logic (which is where the special left-hand side rule of the ternary
> > > > > and logical operators come in) are basically an exception to the rule
> > > > > that sequence points end up being also meaningful for ordering (ok, so
> > > > > C11 seems to have renamed "sequence points" to "sequenced before").
> > > > >
> > > > > So while an expression like
> > > > >
> > > > >     atomic_read(p, consume) ? a : b;
> > > > >
> > > > > doesn't have a data dependency from the atomic read that forces
> > > > > serialization, writing
> > > > >
> > > > >     if (atomic_read(p, consume))
> > > > >         a;
> > > > >     else
> > > > >         b;
> > > > >
> > > > > the standard *does* imply that the atomic read is "happens-before" wrt
> > > > > "a", and I'm hoping that there is no question that the control
> > > > > dependency still acts as an ordering point.
> > > >
> > > > The control dependency should order subsequent stores, at least assuming
> > > > that "a" and "b" don't start off with identical stores that the compiler
> > > > could pull out of the "if" and merge. The same might also be true for ?:
> > > > for all I know. (But see below)
> > >
> > > I don't think this is quite true. I agree that a conditional store will
> > > not be executed speculatively (note that if it would happen in both the
> > > then and the else branch, it's not conditional); so, the store in
> > > "a;" (assuming it would be a store) won't happen unless the thread can
> > > really observe a true value for p.
> > > However, this is *this thread's*
> > > view of the world, but not guaranteed to constrain how any other thread
> > > sees the state. mo_consume does not contribute to
> > > inter-thread-happens-before in the same way that mo_acquire does (which
> > > *does* put a constraint on i-t-h-b, and thus enforces a global
> > > constraint that all threads have to respect).
> > >
> > > Is it clear which distinction I'm trying to show here?
> >
> > If you are saying that the control dependencies are a result of a
> > combination of the standard and the properties of the hardware that
> > Linux runs on, I am with you. (As opposed to control dependencies being
> > a result solely of the standard.)
>
> I'm not quite sure I understand what you mean :) Do you mean the
> control dependencies in the binary code, or the logical "control
> dependencies" in source programs?

At present, the intersection of those two sets, but only including those
control dependencies beginning with a memory_order_consume load or a
[[carries_dependency]] function argument or return value.

Or something like that. ;-)

> > This was a deliberate decision in 2007 or so. At that time, the
> > documentation on CPU memory orderings were pretty crude, and it was
> > not clear that all relevant hardware respected control dependencies.
> > Back then, if you wanted an authoritative answer even to a fairly simple
> > memory-ordering question, you had to find a hardware architect, and you
> > probably waited weeks or even months for the answer. Thanks to lots
> > of work from the Cambridge guys at about the time that the standard was
> > finalized, we have a much better picture of what the hardware does.
>
> But this part I understand.

;-)

							Thanx, Paul
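
For reference, here is a minimal C11 sketch of the source forms discussed
in the quoted text above. It is an illustration only, not code from the
thread: "atomic_read(p, consume)" is shorthand in the discussion, and it
is spelled here as atomic_load_explicit() with memory_order_consume. The
names flag, a, b and the three function names are hypothetical stand-ins.
The sketch is single-threaded and shows only the shape of each form; it
does not demonstrate any ordering guarantee, and whether a given compiler
and CPU preserve the control dependency is exactly the open question in
the thread.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for "p", "a" and "b" in the discussion. */
static atomic_int flag;
static int a;
static int b;

/* Form 1: the ?: expression. The loaded value only selects an operand. */
int pick_ternary(void)
{
    return atomic_load_explicit(&flag, memory_order_consume) ? a : b;
}

/* Form 2: if/else where each branch performs a different store,
 * so each store is conditional on the value that was loaded. */
void store_if_else(void)
{
    if (atomic_load_explicit(&flag, memory_order_consume))
        a = 1;
    else
        b = 1;
}

/* The caveat case: both branches begin with the same store to "a".
 * A compiler could pull that store out of the "if" and merge it,
 * at which point it is no longer conditional on the load. */
void store_identical(void)
{
    if (atomic_load_explicit(&flag, memory_order_consume)) {
        a = 1;
        b = 2;
    } else {
        a = 1;
        b = 3;
    }
}
```

In store_identical() only the stores to "b" differ between the branches,
so only they remain conditional on the load if the store to "a" is merged.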