I've noticed in the disassemblies of code that GCC generates (via the -save-temps option) that it tries to avoid pushing arguments onto the stack when calling functions. Instead, it reserves some space on the stack in the calling function and movs the arguments directly into it (even with -Os).

Then, for an unrelated reason, I added -mno-stack-arg-probe to the command line arguments, and noticed that the number of lines of *.s output generated in my project had gone up by about 3%, while the size of the resulting binary had gone down by about 7%. When I looked at the disassemblies again to find out why, I found that it was suddenly passing arguments via push. An example function follows.

Without -mno-stack-arg-probe:

    __ZN6screen6scrollEv:
        push ebp
        mov ebp, esp
        push esi
        push ebx
        sub esp, 16
        mov esi, DWORD PTR __ZL5width
        sal esi
        mov ebx, DWORD PTR __ZL16framebuffer_size
        sub ebx, esi
        mov DWORD PTR [esp+8], ebx
        lea ecx, [esi+753664]
        mov DWORD PTR [esp+4], ecx
        mov DWORD PTR [esp], 753664
        call __Z7memmovePvPKvj
        mov DWORD PTR [esp+8], esi
        movzx edx, BYTE PTR __ZL9backcolor
        sal edx, 4
        or dl, BYTE PTR __ZL9forecolor
        movzx eax, dl
        sal eax, 8
        mov DWORD PTR [esp+4], eax
        add ebx, 753664
        mov DWORD PTR [esp], ebx
        call __Z9memset_16Pvij
        add esp, 16
        pop ebx
        pop esi
        leave
        ret

With -mno-stack-arg-probe:

    __ZN6screen6scrollEv:
        push ebp
        mov ebp, esp
        push esi
        push ebx
        mov esi, DWORD PTR __ZL5width
        sal esi
        mov ebx, DWORD PTR __ZL16framebuffer_size
        sub ebx, esi
        push ecx
        push ebx
        lea ecx, [esi+753664]
        push ecx
        push 753664
        call __Z7memmovePvPKvj
        add esp, 12
        push esi
        movzx edx, BYTE PTR __ZL9backcolor
        sal edx, 4
        or dl, BYTE PTR __ZL9forecolor
        movzx eax, dl
        sal eax, 8
        push eax
        add ebx, 753664
        push ebx
        call __Z9memset_16Pvij
        add esp, 16
        lea esp, [ebp-8]
        pop ebx
        pop esi
        leave
        ret

This behavior seems extremely peculiar.
Can anyone shed some light?
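
For anyone who wants to reproduce the comparison, here is a rough sketch of what the function might look like, reconstructed only from the mangled names and operands in the listings. It is not the original source. Assumptions: `screen` is treated as a namespace (a class member would mangle the same way), the literal 753664 (0xB8000) is replaced by an ordinary array so the code can run in user space, `std::memmove` stands in for the project's own `memmove`, and the body of `memset_16` is a guess at its meaning.

```cpp
// Hypothetical reconstruction for experimenting with the two flag sets.
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: stand-in for the text buffer at 0xB8000 (753664 in the listings).
static std::uint16_t framebuffer[80 * 25];

static unsigned width = 80;                              // columns per row (assumed)
static unsigned framebuffer_size = sizeof(framebuffer);  // size in bytes (assumed)
static std::uint8_t backcolor = 0;                       // assumed default
static std::uint8_t forecolor = 7;                       // assumed default

// Assumption: fills `bytes / 2` 16-bit cells with `value`.
void* memset_16(void* dest, int value, unsigned bytes) {
    std::uint16_t* p = static_cast<std::uint16_t*>(dest);
    for (unsigned i = 0; i < bytes / 2; ++i) {
        p[i] = static_cast<std::uint16_t>(value);
    }
    return dest;
}

namespace screen {

void scroll() {
    unsigned row_bytes = width * 2;                 // sal esi
    assert(row_bytes <= framebuffer_size);
    unsigned rest = framebuffer_size - row_bytes;   // sub ebx, esi
    char* base = reinterpret_cast<char*>(framebuffer);

    // First call: move everything up by one row.
    std::memmove(base, base + row_bytes, rest);

    // Second call: blank the last row with the current attribute byte
    // shifted into the high half of each 16-bit cell.
    int blank = static_cast<std::uint8_t>((backcolor << 4) | forecolor) << 8;
    memset_16(base + rest, blank, row_bytes);
}

}  // namespace screen
```

Compiling a file like this to assembly with `-Os -masm=intel -S`, once with and once without `-mno-stack-arg-probe`, should make the two calling sequences easy to diff, though whether the difference shows up may depend on the GCC version and target.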