Re: [RFC][PATCH 0/5] arch: atomic rework

Peter Sewell <Peter.Sewell@xxxxxxxxxxxx> · Fri, 21 Feb 2014 19:48:47 +0000

On 21 February 2014 19:41, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Feb 21, 2014 at 11:16 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Why would this be any different, especially since it's easy to
>> understand both for a human and a compiler?
>
> Btw, the actual data path may actually be semantically meaningful even
> at a processor level.
>
> For example, let's look at that gcc bugzilla that got mentioned
> earlier, and let's assume that gcc is fixed to follow the "arithmetic
> is always meaningful, even if it is only syntactic" the letter.
> So we have that gcc bugzilla use-case:
>
>    flag ? *(q + flag - flag) : 0;
>
> and let's say that the fixed compiler now generates the code with the
> data dependency that is actually suggested in that bugzilla entry:
>
>         and     w2, w2, #0
>         ldr     w0, [x1, w2]
>
> ie the CPU actually sees that address data dependency. Now everything
> is fine, right?
>
> Wrong.
>
> It is actually quite possible that the CPU sees the "and with zero"
> and *breaks the dependencies on the incoming value*.

For reference: the Power and ARM architectures explicitly guarantee
not to do this, the architects are quite clear about it, and we've
tested (some cases) rather thoroughly.
I can't speak about other architectures.

> Modern CPU's literally do things like that. Seriously. Maybe not that
> particular one, but you'll sometimes find that the CPU - int he
> instruction decoding phase (ie very early in the pipeline) notices
> certain patterns that generate constants, and actually drop the data
> dependency on the "incoming" registers.
>
> On x86, generating zero using "xor" on the register with itself is one
> such known sequence.
>
> Can you guarantee that powerpc doesn't do the same for "and r,r,#0"?
> Or what if the compiler generated the much more obvious
>
>     sub w2,w2,w2
>
> for that "+flag-flag"? Are you really 100% sure that the CPU won't
> notice that that is just a way to generate a zero, and doesn't depend
> on the incoming values?
>
> Because I'm not. I know CPU designers that do exactly this.
>
> So I would actually and seriously argue that the whole C standard
> attempt to use a syntactic data dependency as a determination of
> whether two things are serialized is wrong, and that you actually
> *want* to have the compiler optimize away false data dependencies.
>
> Because people playing tricks with "+flag-flag" and thinking that that
> somehow generates a data dependency - that's *wrong*. It's not just
> the compiler that decides "that's obviously nonsense, I'll optimize it
> away". The CPU itself can do it.
>
> So my "actual semantic dependency" model is seriously more likely to
> be *correct*. Not just t a compiler level.
>
> Btw, any tricks like that, I would also take a second look at the
> assembler and the linker. Many assemblers do some trivial
> optimizations too.

That's certainly something worth checking.

> Are you sure that "and     w2, w2, #0" really ends
> up being encoded as an "and"? Maybe the assembler says "I can do that
> as a "mov w2,#0" instead? Who knows? Even power and ARM have their
> variable-sized encodings (there are some "compressed executable"
> embedded power processors, and there is obviously Thumb2, and many
> assemblers end up trying to use equivalent "small" instructions..
>
> So the whole "fake data dependency" thing is just dangerous on so many levels.
>
> MUCH more dangerous than my "actual real dependency" model.
>
>                    Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-arch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html