On 21 February 2014 19:41, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Feb 21, 2014 at 11:16 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Why would this be any different, especially since it's easy to
>> understand both for a human and a compiler?
>
> Btw, the actual data path may actually be semantically meaningful even
> at a processor level.
>
> For example, let's look at that gcc bugzilla that got mentioned
> earlier, and let's assume that gcc is fixed to follow the "arithmetic
> is always meaningful, even if it is only syntactic" to the letter.
> So we have that gcc bugzilla use-case:
>
>     flag ? *(q + flag - flag) : 0;
>
> and let's say that the fixed compiler now generates the code with the
> data dependency that is actually suggested in that bugzilla entry:
>
>     and w2, w2, #0
>     ldr w0, [x1, w2]
>
> ie the CPU actually sees that address data dependency. Now everything
> is fine, right?
>
> Wrong.
>
> It is actually quite possible that the CPU sees the "and with zero"
> and *breaks the dependencies on the incoming value*.

For reference: the Power and ARM architectures explicitly guarantee
not to do this; the architects are quite clear about it, and we've
tested (some cases) rather thoroughly. I can't speak about other
architectures.

> Modern CPU's literally do things like that. Seriously. Maybe not that
> particular one, but you'll sometimes find that the CPU - in the
> instruction decoding phase (ie very early in the pipeline) notices
> certain patterns that generate constants, and actually drop the data
> dependency on the "incoming" registers.
>
> On x86, generating zero using "xor" on the register with itself is one
> such known sequence.
>
> Can you guarantee that powerpc doesn't do the same for "and r,r,#0"?
> Or what if the compiler generated the much more obvious
>
>     sub w2,w2,w2
>
> for that "+flag-flag"? Are you really 100% sure that the CPU won't
> notice that that is just a way to generate a zero, and doesn't depend
> on the incoming values?
>
> Because I'm not. I know CPU designers that do exactly this.
>
> So I would actually and seriously argue that the whole C standard
> attempt to use a syntactic data dependency as a determination of
> whether two things are serialized is wrong, and that you actually
> *want* to have the compiler optimize away false data dependencies.
>
> Because people playing tricks with "+flag-flag" and thinking that that
> somehow generates a data dependency - that's *wrong*. It's not just
> the compiler that decides "that's obviously nonsense, I'll optimize it
> away". The CPU itself can do it.
>
> So my "actual semantic dependency" model is seriously more likely to
> be *correct*. Not just at a compiler level.
>
> Btw, any tricks like that, I would also take a second look at the
> assembler and the linker. Many assemblers do some trivial
> optimizations too.

That's certainly something worth checking.

> Are you sure that "and w2, w2, #0" really ends
> up being encoded as an "and"? Maybe the assembler says "I can do that
> as a 'mov w2,#0' instead"? Who knows? Even power and ARM have their
> variable-sized encodings (there are some "compressed executable"
> embedded power processors, and there is obviously Thumb2), and many
> assemblers end up trying to use equivalent "small" instructions.
>
> So the whole "fake data dependency" thing is just dangerous on so many levels.
>
> MUCH more dangerous than my "actual real dependency" model.
>
>                 Linus
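
For anyone following along, the distinction being argued over can be sketched in plain C. This is only an illustration under stated assumptions: the function names (fake_dep_load, folded_load, real_dep_load) are made up for this note and appear neither in the thread nor in the bugzilla entry, flag is assumed to be 0 or 1, and plain C like this carries no ordering guarantee of its own. It shows the shape of the two kinds of dependency, nothing more.

```c
/* Syntactic-only dependency: on paper the address involves flag, but
 * q + flag - flag is always q, so the flag input to the address can be
 * dropped without changing the result. (flag is assumed to be 0 or 1,
 * so the intermediate pointer stays within one-past-the-end of *q.) */
int fake_dep_load(int flag, const int *q)
{
    return flag ? *(q + flag - flag) : 0;
}

/* What an optimizer may turn the function above into: the same value
 * for every valid input, with no trace of flag in the address. */
int folded_load(int flag, const int *q)
{
    return flag ? *q : 0;
}

/* Semantic dependency: the address of the second load is the value
 * produced by the first load, so it cannot be folded away. */
int real_dep_load(const int *const *slot)
{
    const int *p = *slot;   /* first load yields the pointer */
    return p ? *p : 0;      /* second load uses that pointer as its address */
}
```

Calling fake_dep_load and folded_load with the same arguments gives the same value, which is exactly why nothing obliges a compiler to keep the "+flag-flag" around; real_dep_load has no equivalent rewrite, because the pointer value is only known once the first load has completed.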