In some cases, it is desirable as an optimization not to save any
callee-saved registers in the function prologue. This is common for
functions at the lowest frame, where there is nothing to return to
and unwinding cannot proceed any further.

However, GCC seems to generate code for saving registers even for
such functions. For example:

int f1 (int);

__attribute__ ((noreturn, nothrow))
void
f2 (void)
{
  int x1 = f1 (1);
  int x2 = f1 (2);
  int x3 = f1 (3);
  int x4 = f1 (4);
  f1 (x1);
  f1 (x2);
  f1 (x3);
  f1 (x4);
  __builtin_unreachable ();
}

yields this on x86-64 (with GCC 9):

f2:
        pushq   %r14
        movl    $1, %edi
        pushq   %r13
        pushq   %r12
        pushq   %rbp
        subq    $8, %rsp
        call    f1@PLT
        movl    $2, %edi
        movl    %eax, %r14d
        call    f1@PLT
        movl    $3, %edi
        movl    %eax, %r13d
        call    f1@PLT
        movl    $4, %edi
        movl    %eax, %r12d
        call    f1@PLT
        movl    %r14d, %edi
        movl    %eax, %ebp
        call    f1@PLT
        movl    %r13d, %edi
        call    f1@PLT
        movl    %r12d, %edi
        call    f1@PLT
        movl    %ebp, %edi
        call    f1@PLT

Is there a way to avoid the four pushq instructions? Doing so would
reduce the stack usage of every thread by a couple of words.

Thanks,
Florian