On Tue, Feb 18, 2014 at 04:56:40PM +0100, Torvald Riegel wrote:
> On Mon, 2014-02-17 at 19:00 -0800, Paul E. McKenney wrote:
> > On Mon, Feb 17, 2014 at 12:18:21PM -0800, Linus Torvalds wrote:
> > > On Mon, Feb 17, 2014 at 11:55 AM, Torvald Riegel <triegel@xxxxxxxxxx> wrote:
> > > >
> > > > Which example do you have in mind here? Haven't we resolved all the
> > > > debated examples, or did I miss any?
> > >
> > > Well, Paul seems to still think that the standard possibly allows
> > > speculative writes or possibly value speculation in ways that break
> > > the hardware-guaranteed orderings.
> >
> > It is not that I know of any specific problems, but rather that I
> > know I haven't looked under all the rocks. Plus my impression from
> > my few years on the committee is that the standard will be pushed to
> > the limit when it comes time to add optimizations.
> >
> > One example that I learned about last week uses the branch-prediction
> > hardware to validate value speculation. And no, I am not at all a fan
> > of value speculation, in case you were curious. However, it is still
> > an educational example.
> >
> > This is where you start:
> >
> >     p = gp.load_explicit(memory_order_consume); /* AKA rcu_dereference() */
> >     do_something(p->a, p->b, p->c);
> >     p->d = 1;
>
> I assume that's the source code.

Yep!

> > Then you leverage branch-prediction hardware as follows:
> >
> >     p = gp.load_explicit(memory_order_consume); /* AKA rcu_dereference() */
> >     if (p == GUESS) {
> >         do_something(GUESS->a, GUESS->b, GUESS->c);
> >         GUESS->d = 1;
> >     } else {
> >         do_something(p->a, p->b, p->c);
> >         p->d = 1;
> >     }
>
> I assume that this is a potential transformation by a compiler.

Again, yep!

> > The CPU's branch-prediction hardware squashes speculation in the case where
> > the guess was wrong, and this prevents the speculative store to ->d from
> > ever being visible.
> > However, the then-clause breaks dependencies, which
> > means that the loads -could- be speculated, so that do_something() gets
> > passed pre-initialization values.
> >
> > Now, I hope and expect that the wording in the standard about dependency
> > ordering prohibits this sort of thing. But I do not yet know for certain.
>
> The transformation would be incorrect. p->a in the source code carries
> a dependency, and as you say, the transformed code wouldn't have that
> dependency any more. So the transformed code would lose ordering
> constraints that it has in the virtual machine, so in the absence of
> other proofs of correctness based on properties not shown in the
> example, the transformed code would not result in the same behavior as
> allowed by the abstract machine.

Glad that you agree! ;-)

> If the transformation would actually be by a programmer, then this
> wouldn't do the same as the first example because mo_consume doesn't
> work through the if statement.

Agreed.

> Are there other specified concerns that you have regarding this example?

Nope. Just generalized paranoia. (But just because I am paranoid
doesn't mean that there isn't a bug lurking somewhere in the standard,
the compiler, the kernel, or my own head!)

I will likely have more once I start mapping Linux kernel atomics to the
C11 standard. One more paper past N3934 comes first, though.

                                                        Thanx, Paul
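
For anyone who wants to compile the two code shapes discussed above, here
is a minimal C11 sketch. The struct foo layout, the guess_obj and other_obj
objects, the publish() helper, and the body of do_something() are all
assumptions made up for illustration; only gp, GUESS, the field names a
through d, and the two reader bodies come from the thread. Running this
single-threaded cannot show the ordering problem, so the sketch only makes
the lost dependency visible in the source. Current compilers also typically
promote memory_order_consume to memory_order_acquire.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical structure; the thread only names the fields a, b, c, d. */
struct foo {
    int a, b, c, d;
};

/* Hypothetical objects standing in for the guessed and non-guessed targets. */
static struct foo guess_obj = { 1, 2, 3, 0 };
static struct foo other_obj = { 4, 5, 6, 0 };
#define GUESS (&guess_obj)

/* The published pointer, gp in the thread. */
static _Atomic(struct foo *) gp;

/* Hypothetical stand-in for do_something(): records what it was passed. */
static int last_sum;
static void do_something(int a, int b, int c)
{
    last_sum = a + b + c;
}

/* Hypothetical publisher: release store so that initialization is ordered
 * before the pointer becomes visible. */
static void publish(struct foo *p)
{
    atomic_store_explicit(&gp, p, memory_order_release);
}

/* Source form: every access goes through p, so each one carries a
 * dependency from the consume load. */
static void reader_source(void)
{
    struct foo *p = atomic_load_explicit(&gp, memory_order_consume);

    do_something(p->a, p->b, p->c);
    p->d = 1;
}

/* Transformed form: in the then-clause the addresses come from the
 * constant GUESS, not from p, so nothing there depends on the load. */
static void reader_transformed(void)
{
    struct foo *p = atomic_load_explicit(&gp, memory_order_consume);

    if (p == GUESS) {
        do_something(GUESS->a, GUESS->b, GUESS->c);
        GUESS->d = 1;
    } else {
        do_something(p->a, p->b, p->c);
        p->d = 1;
    }
}
```

Calling publish(&guess_obj) followed by either reader sets guess_obj.d
to 1 and passes the same values to do_something(). The two readers differ
only in whether the accesses in the then-clause depend on the value loaded
from gp, which is exactly the property the transformation gives up.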