Re: Inline asm function calls + red zone

On Tue, Nov 15, 2011 at 4:59 PM, Ian Lance Taylor <iant@xxxxxxxxxx> wrote:
> Jason Garrett-Glaser <jason@xxxxxxxx> writes:
>
>> From talking with others, there appears to be a problem with function
>> calls in inline asm on x86_64: the call clobbers the first 8 bytes of
>> the stack red zone, which GCC is allowed to use for other data in the
>> function.  This is a problem even if the function being called doesn't
>> use the stack, because "call" itself does use the stack.  Besides the
>> extremely hacky sequence of:
>>
>> sub rsp, 128
>> call func
>> add rsp, 128
>>
>> Is there a way to tell gcc not to use the red zone in a function, or
>> that part of the red zone is going to be clobbered by the inline
>> assembly code?
>
> In general making function calls from asm code is not supported, and
> this is one of the reasons why that is so.  There are many targets for
> which gcc optimizes leaf functions differently from non-leaf functions.
> An asm with a function call turns a leaf function into a non-leaf
> function, but gcc doesn't know that that is happening.  This causes
> things to break.

Inline asm is naturally written for one specific target (in this
case, x86_32 or x86_64).  Are there ways that things break besides
the red zone on these architectures?

>
> You can work around this specific issue by using -mno-red-zone when you
> compile the file containing the asm.  I can't guarantee that you won't
> run into other issues.
>
> Ian
>
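
To make the quoted sub/call/add sequence concrete, here is a minimal
sketch for x86_64 with GCC-style extended asm.  The names add_one and
call_via_asm are made up for illustration, and the callee is assumed
to follow the ordinary System V calling convention and to use no
vector registers.  Because gcc does not know a call is happening, the
asm steps over the 128-byte red zone, realigns the stack itself, and
lists the caller-saved integer registers as clobbers.  This is only a
sketch; Ian's caveat that calls from asm are unsupported still applies.

```c
/* Hypothetical callee; noinline so that a real call is made. */
__attribute__((noinline)) static long add_one(long x)
{
    return x + 1;
}

/* Calls add_one() from inline asm without touching the red zone. */
static long call_via_asm(long x)
{
#if defined(__x86_64__) && defined(__GNUC__) && !defined(_WIN32)
    long ret;
    long (*fn)(long) = add_one;

    __asm__ volatile(
        "mov %%rsp, %%rbx\n\t"   /* remember the original stack pointer */
        "sub $128, %%rsp\n\t"    /* step over the 128-byte red zone */
        "and $-16, %%rsp\n\t"    /* 16-byte alignment expected at a call */
        "call *%[fn]\n\t"        /* the return address lands below the red zone */
        "mov %%rbx, %%rsp\n\t"   /* rbx is callee-saved, so it survived the call */
        : "=a"(ret),             /* System V: return value in rax */
          "+D"(x)                /* System V: first argument in rdi, may be trashed */
        : [fn] "r"(fn)
        /* Caller-saved integer registers.  A callee that used SSE or x87
           would also need the xmm/st registers listed here. */
        : "rbx", "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
          "cc", "memory");
    return ret;
#else
    /* Other targets: no red-zone trick needed for this illustration. */
    return add_one(x);
#endif
}
```

Calling call_via_asm(41) from an otherwise leaf function gives the
same result as calling add_one(41) directly.  Compiling the file with
-mno-red-zone instead makes the sub/and lines unnecessary.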

As an example, here is one use of a call inside inline assembly:

static ALWAYS_INLINE void x264_cabac_encode_decision( x264_cabac_t *cb, int i_ctx, int b )
{
    asm(
        "call _x264_cabac_encode_decision_asm\n"
        :"+c"(i_ctx),"+d"(b), "+m"(*cb)
        :"a"(cb)
    );
}

By writing the call explicitly, we can define a custom calling
convention and state exactly which memory the function modifies and
which registers it preserves.  For example, this assembly function
does not modify its first argument, "cb", which is passed in eax (the
"a" constraint), so cb is declared as an input only.

This is a nice balance between inlining and not inlining: inlining
every instance of this function (x264_cabac_encode_decision_asm, not
shown) would significantly slow the program due to L1I cache
thrashing, while calling it normally clobbers many registers that the
function doesn't use, requires the stack (on x86_32), and so forth.
The ability to optimize a function with a custom calling convention
in mind can be useful for relatively small functions like this that
need to be called repeatedly.
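
The custom convention described above can be sketched in a
self-contained way, assuming x86_64, ELF and GCC-style asm.  The names
demo_state_t, demo_accumulate and demo_accumulate_asm are hypothetical
stand-ins, not x264 code; only the constraint pattern mirrors the
wrapper above.  The sketch also steps over the red zone around the
call, since that is the problem this thread is about.

```c
/* Hypothetical state type standing in for x264_cabac_t. */
typedef struct { int total; } demo_state_t;

#if defined(__x86_64__) && defined(__ELF__) && defined(__GNUC__)
/* Hypothetical routine with a private convention:
     rax = pointer to state (preserved)
     ecx = i_ctx (trashed), edx = b (trashed)
   It touches no other registers and needs no stack of its own. */
__asm__(
    ".pushsection .text\n"
    "demo_accumulate_asm:\n"
    "    addl %ecx, (%rax)\n"   /* st->total += i_ctx */
    "    addl %edx, (%rax)\n"   /* st->total += b */
    "    xorl %ecx, %ecx\n"     /* ecx is scratch in this convention */
    "    xorl %edx, %edx\n"     /* edx is scratch in this convention */
    "    ret\n"
    ".popsection\n"
);

static inline void demo_accumulate(demo_state_t *st, int i_ctx, int b)
{
    __asm__(
        "lea -128(%%rsp), %%rsp\n\t"   /* keep the call's push out of the red zone */
        "call demo_accumulate_asm\n\t"
        "lea 128(%%rsp), %%rsp\n\t"    /* lea leaves the flags alone */
        : "+c"(i_ctx), "+d"(b),        /* in/out: the routine may change them */
          "+m"(*st)                    /* the only memory it reads and writes */
        : "a"(st)                      /* input only: rax is preserved */
        : "cc");
}
#else
/* Fallback with the same observable effect on other targets. */
static inline void demo_accumulate(demo_state_t *st, int i_ctx, int b)
{
    st->total += i_ctx + b;
}
#endif
```

With a state whose total starts at 10, demo_accumulate(&st, 5, 1)
leaves st.total at 16.  Because rsi, rdi and r8-r11 are not listed as
clobbers, gcc is free to keep live values in them across the call,
which an ordinary function call would not allow.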

Jason


