On Tue, Nov 15, 2011 at 4:59 PM, Ian Lance Taylor <iant@xxxxxxxxxx> wrote:
> Jason Garrett-Glaser <jason@xxxxxxxx> writes:
>
>> From talking with others, there appears to be a problem with function
>> calls in inline asm on x86_64: the call clobbers the first 8 bytes of
>> the stack red zone, which GCC is allowed to use for other data in the
>> function.  This is a problem even if the function being called doesn't
>> use the stack, because "call" itself does use the stack.  Besides the
>> extremely hacky sequence of:
>>
>> sub esp, 128
>> call func
>> add esp, 128
>>
>> Is there a way to tell gcc not to use the red zone in a function, or
>> that part of the red zone is going to be clobbered by the inline
>> assembly code?
>
> In general making function calls from asm code is not supported, and
> this is one of the reasons why that is so.  There are many targets for
> which gcc optimizes leaf functions differently from non-leaf functions.
> An asm with a function call turns a leaf function into a non-leaf
> function, but gcc doesn't know that that is happening.  This causes
> things to break.

Inline asm is naturally written for one very specific target (in this
case, x86_32 or x86_64).  Are there ways that things break besides the
red zone on these architectures?

> You can work around this specific issue by using -mno-red-zone when you
> compile the file containing the asm.  I can't guarantee that you won't
> run into other issues.
>
> Ian
>

For an example, here's one use of calling within inline assembly:

static ALWAYS_INLINE void x264_cabac_encode_decision( x264_cabac_t *cb, int i_ctx, int b )
{
    asm(
        "call _x264_cabac_encode_decision_asm\n"
        /* ecx (i_ctx) and edx (b) are read and may be overwritten;
         * *cb is read and modified in memory. */
        :"+c"(i_ctx),"+d"(b), "+m"(*cb)
        /* eax (cb) is input only: the callee leaves it intact. */
        :"a"(cb)
    );
}

By explicitly declaring the call, we can define a custom calling
convention, as well as explicitly state what memory the function
modifies and which registers are retained.
For example, this assembly function does not modify the first
argument, eax ("cb"), so this can be declared in the constraint list
accordingly.

This is a nice balance between inlining and not inlining: inlining
every instance of this function (x264_cabac_encode_decision_asm, not
shown) would significantly slow the program due to L1I cache
thrashing, but calling it normally clobbers many registers that the
function doesn't use, requires the stack (on x86_32), and so forth.
The ability to optimize a function with a custom calling convention in
mind can be useful for relatively small functions like this that need
to be called repeatedly.

Jason
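
To make the quoted sub/call/add workaround concrete, here is a minimal
sketch of it on x86_64.  It is not the x264 code: demo_add_asm and
demo_add are hypothetical names, and the callee is a toy that uses a
made-up convention (value in ecx, increment in edx, result in ecx,
nothing else touched).  It uses rsp, the 64-bit stack pointer, where
the quoted snippet writes esp, and lea rather than sub/add so the
adjustment itself leaves the flags alone.  It assumes GCC or Clang on
an ELF target; elsewhere it falls back to plain C.

```c
#if defined(__x86_64__) && defined(__GNUC__) && defined(__ELF__)

/* Hypothetical callee with a custom convention:
 *   in:  ecx = value, edx = increment
 *   out: ecx = value + increment
 * It modifies only ecx and the flags, and uses no stack of its own. */
__asm__(
    ".pushsection .text\n"
    "demo_add_asm:\n"
    "    addl %edx, %ecx\n"
    "    ret\n"
    ".popsection\n"
);

static inline int demo_add(int value, int inc)
{
    __asm__(
        /* Step over the 128-byte red zone so the return address
         * pushed by "call" cannot overwrite data GCC keeps there. */
        "lea -128(%%rsp), %%rsp\n\t"
        "call demo_add_asm\n\t"
        "lea 128(%%rsp), %%rsp\n\t"
        : "+c"(value)   /* ecx: read, then overwritten with the result */
        : "d"(inc)      /* edx: input only, preserved by the callee */
        : "cc"          /* addl changes the flags */
    );
    return value;
}

#else

/* Fallback for other targets: same result, ordinary C. */
static inline int demo_add(int value, int inc)
{
    return value + inc;
}

#endif
```

Note that there are no "m" operands here: a memory operand addressed
relative to rsp would point at the wrong place while rsp is moved, so
this trick only combines safely with register operands or with memory
reached through another register.  Compiling the file with
-mno-red-zone, as Ian suggests, removes the need for the two lea
instructions altogether.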