On Fri, 2014-02-07 at 18:06 +0100, Peter Zijlstra wrote:
> On Fri, Feb 07, 2014 at 04:55:48PM +0000, Will Deacon wrote:
> > Hi Paul,
> >
> > On Fri, Feb 07, 2014 at 04:50:28PM +0000, Paul E. McKenney wrote:
> > > On Fri, Feb 07, 2014 at 08:44:05AM +0100, Peter Zijlstra wrote:
> > > > On Thu, Feb 06, 2014 at 08:20:51PM -0800, Paul E. McKenney wrote:
> > > > > Hopefully some discussion of out-of-thin-air values as well.
> > > >
> > > > Yes, absolutely shoot store speculation in the head already. Then drive
> > > > a wooden stake through its heart.
> > > >
> > > > C11/C++11 should not be allowed to claim itself a memory model until that
> > > > is sorted.
> > >
> > > There actually is a proposal being put forward, but it might not make ARM
> > > and Power people happy because it involves adding a compare, a branch,
> > > and an ISB/isync after every relaxed load... Me, I agree with you,
> > > much preferring the no-store-speculation approach.
> >
> > Can you elaborate a bit on this please? We don't permit speculative stores
> > in the ARM architecture, so it seems counter-intuitive that GCC needs to
> > emit any additional instructions to prevent that from happening.
> >
> > Stores can, of course, be observed out-of-order but that's a lot more
> > reasonable :)
>
> This is more about the compiler speculating on stores; imagine:
>
>   if (x)
>           y = 1;
>   else
>           y = 2;
>
> The compiler is allowed to change that into:
>
>   y = 2;
>   if (x)
>           y = 1;

If you write the example like that, this is indeed allowed because it's
all sequential code (and there's no volatiles in there, at least you
didn't show them :). A store to y would happen in either case. You
cannot observe the difference between both examples in a data-race-free
program.

Are there supposed to be atomic/non-sequential accesses in there? If so,
please update the example.

> Which is of course a big problem when you want to rely on the ordering.
>
> There's further problems where things like memset() can write outside
> the specified address range. Examples are memset() using single
> instructions to wipe entire cachelines and then 'restoring' the tail
> bit.

As Joseph said, this would be a bug IMO.

> While valid for single threaded, it's a complete disaster for concurrent
> code.
>
> There's more, but it all boils down to doing stores you don't expect in
> a 'sane' concurrent environment and/or don't respect the control flow.

A few of those got fixed already, because they violated the memory
model's requirements. If you have further examples that are valid code
in the C11/C++11 model, please report them.
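
For reference, here is a minimal C11 sketch of the distinction drawn in
the reply about the if/else example. It is not code from the thread: the
function names (store_plain, store_hoisted, store_atomic) and the
pointer parameters are made up for illustration. With a plain int, the
original and the hoisted form cannot be told apart by a data-race-free
program. With a relaxed atomic store, as I understand the C11 model, a
concurrent reader may legitimately look at y at any time, so a compiler
that hoisted the store of 2 would expose a value the original never
writes when x is nonzero.

```c
#include <stdatomic.h>

/* Original form with a plain int: exactly one store on either path. */
void store_plain(int *y, int x)
{
	if (x)
		*y = 1;
	else
		*y = 2;
}

/*
 * Hand-written equivalent of the transformed form: an unconditional
 * store of 2, overwritten when x is nonzero.  Any other thread that
 * reads *y concurrently is already racing, so a data-race-free
 * program cannot distinguish this from store_plain().
 */
void store_hoisted(int *y, int x)
{
	*y = 2;
	if (x)
		*y = 1;
}

/*
 * Atomic variant (an assumption about what the example may have been
 * meant to show).  Concurrent readers are allowed here, so the
 * intermediate value 2 in the x != 0 case would be observable if the
 * store were hoisted as in store_hoisted().
 */
void store_atomic(atomic_int *y, int x)
{
	if (x)
		atomic_store_explicit(y, 1, memory_order_relaxed);
	else
		atomic_store_explicit(y, 2, memory_order_relaxed);
}
```

Called from a single thread, all three leave the same final value in y;
the difference only matters to a concurrent observer.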