On Tue, Feb 18, 2014 at 10:49:27AM -0800, Linus Torvalds wrote:
> On Tue, Feb 18, 2014 at 10:21 AM, Peter Sewell
> <Peter.Sewell@xxxxxxxxxxxx> wrote:
> >
> > This is a bit more subtle, because (on ARM and POWER) removing the
> > dependency and conditional branch is actually in general *not* equivalent
> > in the hardware, in a concurrent context.
>
> So I agree, but I think that's a generic issue with non-local memory
> ordering, and is not at all specific to the optimization wrt that
> "x?42:42" expression.
>
> If you have a value that you loaded with a non-relaxed load, and you
> pass that value off to a non-local function that you don't know what
> it does, in my opinion that implies that the compiler had better add
> the necessary serialization to say "whatever that other function does,
> we guarantee the semantics of the load".
>
> So on ppc, if you do a load with "consume" or "acquire" and then call
> another function without having had something in the caller that
> serializes the load, you'd better add the lwsync or whatever before
> the call. Exactly because the function call itself otherwise basically
> breaks the visibility into ordering. You've basically turned a
> load-with-ordering-guarantees into just an integer that you passed off
> to something that doesn't know about the ordering guarantees - and you
> need that "lwsync" in order to still guarantee the ordering.
>
> Tough titties. That's what a CPU with weak memory ordering semantics
> gets in order to have sufficient memory ordering.

And that is in fact what C11 compilers are supposed to do if the function
doesn't have the [[carries_dependency]] attribute on the corresponding
argument or return of the non-local function.  If the function is marked
with [[carries_dependency]], then the compiler has the information needed
in both compilations to make things work correctly.

							Thanx, Paul

> And I don't think it's actually a problem in practice.
> If you are doing loads with ordered semantics, you're not going to pass
> the result off willy-nilly to random functions (or you really *do*
> require the ordering, because the load that did the "acquire" was
> actually for a lock!).
>
> So I really think that the "local optimization" is correct regardless.
>
>               Linus
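
For readers following the thread, here is a minimal C++ sketch of the situation discussed above: a value obtained from a consume load is handed to a function the caller cannot see into, once with the parameter marked [[carries_dependency]] and once without. All names (Payload, g_ptr, publish, read_marked, read_unmarked, run_demo) are hypothetical and chosen only for this illustration. The publisher and the reader run on the same thread here purely to keep the sketch self-contained; in real use they would be on different threads. As far as I know, mainstream compilers currently implement consume by treating it as acquire, so the two callees are expected to compile identically in practice.

```cpp
#include <atomic>

// Hypothetical payload type, used only for this illustration.
struct Payload {
    int value;
};

Payload g_payload{0};
std::atomic<Payload*> g_ptr{nullptr};

// Publisher side: initialize the payload, then publish the pointer with a
// release store so the initialization is ordered before the publication.
void publish(int v) {
    g_payload.value = v;
    g_ptr.store(&g_payload, std::memory_order_release);
}

// Stand-in for a function in another translation unit. The attribute says
// the caller's dependency chain extends through `p`, so a dependency-tracking
// compiler would not need a fence at the call boundary.
int read_marked([[carries_dependency]] Payload* p) {
    return p->value;
}

// Same body without the attribute. Here the dependency information is lost
// at the call, so the compiler would have to order the load itself before
// the call (the "lwsync or whatever" case from the discussion above).
int read_unmarked(Payload* p) {
    return p->value;
}

// Reader side: consume-load the pointer and pass it to both callees.
// Returns the published value, or -1 if nothing was published or the
// two callees disagree.
int run_demo() {
    publish(42);
    Payload* p = g_ptr.load(std::memory_order_consume);
    if (p == nullptr) {
        return -1;
    }
    int a = read_marked(p);
    int b = read_unmarked(p);
    return (a == b) ? a : -1;
}
```

Calling run_demo() from a main() should return the published value. The two callees differ only in what the compiler is allowed to assume at the call site, not in the result they compute.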