Re: Tail call optimization on function with many arguments

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 08/18/2015 11:30 AM, Lionel Villard wrote:
Hi,

consider this small C program:

   #include <stdlib.h>
   int tailoptimized(int c1, int c2, int c3, int c4, int c5, int c6)
  {
     if (c1 > c5)
         {
             if (c4 < 5)
             {
                 return c3 + c2;
             }
         }
         else if (c5 > 45)
                 return c6;
         return c3 + c4 + c5 + c6;
   }

   int nottailoptimized(int c1, int c2, int c3, int c4, int c5, int c6, int
c7)
   {
       if (c2 > c5)
       {
           if (c2 < 5)
          {
               return c3 + c2;
          }
       }
       else if (c1 > 45)
             return c6;
       return c1 + c4 + c5 + c6;
  }

   int call(int count)
   {
       if (count > 100)
     {
         return tailoptimized(count *2, count, count * 3, count, count,
count);
     }
     return nottailoptimized(count, count* 2, count, count, count, count,
count * 4);
   }

   int main (int argc, char* argv[])
   {
       call(atoi(argv[0]));
   }

I used this command to compile it:

     gcc -O1 -foptimize-sibling-calls -S -c test.c

and generated assembly code for the 'call' function is:

call:
.LFB17:
     .cfi_startproc
     cmpl    $100, %edi
     jle    .L12
     leal    (%rdi,%rdi), %eax
     leal    (%rax,%rdi), %edx
     movl    %edi, %r9d
     movl    %edi, %r8d
     movl    %edi, %ecx
     movl    %edi, %esi
     movl    %eax, %edi
     jmp    tailoptimized
.L12:
     subq    $8, %rsp
     .cfi_def_cfa_offset 16
     leal    (%rdi,%rdi), %esi
     leal    0(,%rdi,4), %eax
     movl    %eax, (%rsp)
     movl    %edi, %r9d
     movl    %edi, %r8d
     movl    %edi, %ecx
     movl    %edi, %edx
     call    nottailoptimized
     addq    $8, %rsp
     .cfi_def_cfa_offset 8
     ret
     .cfi_endproc

As you can see, when a function takes more than 6 arguments
(nottailoptimized in this case), TCO is turned off.

What's the reason for this limitation (number of registers?)? Is there a
flag/option that would allow more than 6 arguments?
There's no inherent limit on the number of arguments. It's constrained more by the need to avoid stack allocation in the caller.

In your example "call" must allocate a stack slot for the last arugment to "nottailoptimized". The x86_64 ABI mandates that "call" would also be responsible for deallocating that stack slot. That obviously can't happen if the call to "nottailoptimized" was turned into a tail call.

"tailoptimized" doesn't suffer from this problem because it does not need any of its arguments passed in the stack.


In theory, the tail call optimization could be enhanced to create a clone of "nottailoptimized" which would de-allocate the stack for any arguments and the tail call optimization could jump to that special copy. Nobody's implemented that. It's not even clear if it'd be a good idea or not.

Jeff



[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux