On Thu, Mar 8, 2012 at 6:47 PM, Richard Earnshaw
<Richard.Earnshaw@xxxxxxx> wrote:
> On 08/03/12 17:21, Nicolas Pitre wrote:
>> On Thu, 8 Mar 2012, Richard Earnshaw wrote:
>>
>>> On 02/03/12 21:15, Nicolas Pitre wrote:
>>>> So, to me, the gcc documentation is perfectly clear on this topic.
>>>> there really _is_ a guarantee that those asm marked variables will be in
>>>> the expected registers on entry to the inline asm, given that the
>>>> variable is _also_ listed as an operand to the asm statement. But only
>>>> in that case.
>>>>
>>>> It is true that gcc may reorder other function calls or other code
>>>> around the inline asm and then intervening code can clobber any
>>>> registers. Then it is up to gcc to preserve the variable's content
>>>> elsewhere when its register is used for other purposes, and restore it
>>>> when some inline asm statement is referring to it.
>>>>
>>>> And if gcc does not do this then it is buggy. Version 3.4.0 of gcc was
>>>> buggy. No other gcc versions in the last 7 years had such a problem or
>>>> the __asmeq macro in the kernel would have told us.
>>>>
>>>>> Or, to summarise another way, there is no way to control which register
>>>>> is used to pass something to an inline asm in general (often we get away
>>>>> with this, and there are a lot of inline asms in the kernel that assume
>>>>> it works, but the more you inline the more likely you are to get nasty
>>>>> surprises).
>>>>
>>>> This statement is therefore unfounded and wrong. Please direct the
>>>> tools guy who mislead you to the above gcc documentation.
>>>>
>>>
>>> The problem is not really about re-ordering functions but about implicit
>>> functions that come from the source code; for example
>>>
>>> int foo (int a, int b)
>>> {
>>>   register int x __asm__("r0") = 33;
>>>
>>>   register int c __asm__("r1") = a / b; /* Ooops, clobbers r0 with
>>>                                            division function call. */
>>>
>>>   asm ("svc 0" : : "r" (x));
>>> }
>>>
>>> There's nothing in the specification to say what happens if there's a
>>> statement in the code that causes an implicit clobber of your assembly
>>> register.
>>
>> I'm sure gcc is full of implicit behaviors that are not mentioned in
>> the specification. But as long as the specification is respected, then
>> there is no need to mention any unobservable side effects from a program
>> flow point of view, right?
>>
>> Why wouldn't gcc be able to respect the documented feature by
>> preventing live variable from being clobbered and reloading them in
>> the specified register at the inline asm entry point, just like it does
>> for function calls?
>>
>> Here's an example code that shows that, unfortunately, gcc is still
>> broken with regards to the documented behavior:
>>
>> extern int bar(int);
>> int foo(int y)
>> {
>>         register int x __asm__("r1") = 33;
>>         y += bar(x);
>>         asm ("@ x should be live in %0 here" : "+r" (x) : "r" (y));
>>         y += bar(x);
>>         asm ("@ x should be live in %0 here" : "+r" (x) : "r" (y));
>>         return x;
>> }
>>
>> Result is:
>>
>> foo:
>>         stmfd   sp!, {r4, lr}
>>         mov     r4, r0
>>         mov     r0, #33
>>         bl      bar
>>         add     r4, r0, r4
>>         @ x should be live in r1 here
>>         mov     r0, r1
>>         bl      bar
>>         add     r0, r0, r4
>>         @ x should be live in r1 here
>>         mov     r0, r1
>>         ldmfd   sp!, {r4, lr}
>>         bx      lr
>>
>> To me this is clearly a bug if gcc is not able to meet the documented
>> expectation. And the documented expectation is not at all unreasonable.
>>
> No, in this case it is presumed that /you/ know that calling bar() will
> modify x. Thus the code is either well defined (if you know what is in
> r1 after each call to bar), or undefined (if you can't say anything
> about r1 after each call).

It could be argued that, since the set of registers involved in the PCS
is well known, a programmer who assigns a variable to one of those
registers is consciously aliasing the variable with a global register
which can be destroyed at any time as a consequence of the ABI.

Because there are few guarantees about how the compiler will or won't
transform the code, this suggests that asm("rX") annotations can't work
reliably for r0-r3 or r12 with the ARM PCS.

Indeed, the GCC docs do in fact have this to say:

  "register int *p1 asm ("r0") = ...;
   register int *p2 asm ("r1") = ...;
   register int *result asm ("r0");
   asm ( [...] );

   [...] beware that a register that is call-clobbered by the target
   ABI will be overwritten by any function call in the assignment
   including library calls for arithmetic operators. Also a register
   may be clobbered when generating some operations, like variable
   shift, memory copy or memory move on x86. Assuming it is a
   call-clobbered register, this may happen to `r0' above by the
   assignment to `p2'. If you have to use such a register, use
   temporary variables for expressions between the register assignment
   and use:

   int t1 = ...;
   register int *p1 asm("r0") = ...;
   register int *p2 asm("r1") = t1;
   register int *result asm("r0");
   asm ( [...] )"

But this is at least somewhat in conflict with "The compiler's data flow
analysis is capable of determining where the specified registers contain
live values, and where they are available for other uses."

It also seems to assume -O0 type behaviour, where the compiler is doing
a straightforward sequential translation of the code.

Why it is guaranteed that the assignment to p2 now certainly does not
clobber p1 (even as a side effect), what the implied aliasing of result
with p1 actually guarantees (or whether the compiler really understands
it at all), or what constraints there are on the compiler reordering or
inserting random extraneous code into the above, I have no idea.
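
For concreteness, here is a minimal sketch of the temporaries pattern
the docs describe. It assumes a 32-bit ARM target in ARM state; the
helper name sum_quot_rem() is hypothetical, a plain add stands in for a
real svc, and other targets take a portable fallback so that the sketch
still builds:

```c
#include <assert.h>

/*
 * Hypothetical helper, not taken from the kernel or the GCC manual.
 * The add instruction stands in for something like an svc that expects
 * its arguments in r0 and r1.
 */
static int sum_quot_rem(int a, int b)
{
    assert(b != 0);

    /*
     * Evaluate everything that could expand into a library call
     * (division on ARM cores without a hardware divider, for example)
     * into ordinary temporaries first.
     */
    int q = a / b;
    int r = a % b;

#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)
    /* Bind the registers last, with only plain copies in between. */
    register int arg0 __asm__("r0") = q;
    register int arg1 __asm__("r1") = r;

    /* Both register variables are also listed as asm operands. */
    __asm__ volatile ("add %0, %0, %1" : "+r" (arg0) : "r" (arg1));
    return arg0;
#else
    /* Portable fallback for targets other than 32-bit ARM. */
    return q + r;
#endif
}
```

Even written this way, the result is only right if the compiler emits
nothing that touches r0 or r1 between the two register assignments and
the asm statement.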
Such assumptions don't feel very safe in the presence of optimisation.

In other words, all sorts of undocumented guarantees beyond the C
language are needed for it even to be possible to interpret what the
above code examples should mean in the first place.

The documentation leaves a lot of questions unanswered, but it does at
least suggest that other arches have the same kind of potential pitfalls
that we observed on ARM.

Register variables feel like a red herring though. We're only using
those because we can't do the needful thing and actually describe these
requirements in the asm constraints (which would seem to be the right
place). We specifically don't care where those values are except at the
boundaries of the asm block itself.

Is there a reason why ARM gcc doesn't provide the ability to specify
such exact-register constraints, or is this more for historical reasons?
Is it possible?

Cheers
---Dave