Steven Rostedt <rostedt@xxxxxxxxxxx> writes:

> Compiling with -O2 (which gives no warning) (x86_64) produces:
>
> 0000000000000000 <fn>:
>    0:   48 83 ec 08             sub    $0x8,%rsp
>    4:   e8 00 00 00 00          callq  9 <fn+0x9>
>                         5: R_X86_64_PC32        e-0x4
>    9:   48 85 c0                test   %rax,%rax
>    c:   74 1a                   je     28 <fn+0x28>
>    e:   e8 00 00 00 00          callq  13 <fn+0x13>
>                         f: R_X86_64_PC32        f-0x4
>   13:   48 85 c0                test   %rax,%rax
>   16:   74 10                   je     28 <fn+0x28>
>   18:   48 83 c4 08             add    $0x8,%rsp
>   1c:   e9 00 00 00 00          jmpq   21 <fn+0x21>
>                         1d: R_X86_64_PC32       g-0x4
>   21:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
>   28:   48 83 c4 08             add    $0x8,%rsp
>   2c:   c3                      retq
>
> and compiling with -Os:
>
> 0000000000000000 <fn>:
>    0:   55                      push   %rbp
>    1:   53                      push   %rbx
>    2:   51                      push   %rcx
>    3:   e8 00 00 00 00          callq  8 <fn+0x8>
>                         4: R_X86_64_PC32        e-0x4
>    8:   48 85 c0                test   %rax,%rax
>    b:   48 89 c3                mov    %rax,%rbx
>    e:   74 08                   je     18 <fn+0x18>
>   10:   e8 00 00 00 00          callq  15 <fn+0x15>
>                         11: R_X86_64_PC32       f-0x4
>   15:   48 89 c5                mov    %rax,%rbp
>   18:   48 85 ed                test   %rbp,%rbp
>   1b:   74 0d                   je     2a <fn+0x2a>
>   1d:   48 85 db                test   %rbx,%rbx
>   20:   74 08                   je     2a <fn+0x2a>
>   22:   5a                      pop    %rdx
>   23:   5b                      pop    %rbx
>   24:   5d                      pop    %rbp
>   25:   e9 00 00 00 00          jmpq   2a <fn+0x2a>
>                         26: R_X86_64_PC32       g-0x4
>   2a:   58                      pop    %rax
>   2b:   5b                      pop    %rbx
>   2c:   5d                      pop    %rbp
>   2d:   c3                      retq
>
> Which is 1 byte more than -O2. I would think that -Os would be smaller.

Ideally, it should be, yes. The -Os code would be smaller except that
it needs to save a register across a function call, which forces it to
push and pop %rbx, which in turn means that stack alignment adds yet
another push and two pop instructions. It's a heuristic failure.

Ian
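
For anyone who wants to double-check the "1 byte more" figure, here is a rough sketch that tallies the encoded bytes of each instruction in the two listings quoted above. The byte strings are copied from the objdump output; the names `O2_INSNS`, `OS_INSNS`, `code_size` and `count_one_byte` are made up for this illustration, and the push/pop test assumes the standard one-byte encodings (0x50-0x57 for push, 0x58-0x5f for pop).

```python
# Instruction bytes copied from the two listings above, one string per
# instruction. Relocation lines are left out: they describe bytes that are
# already counted as part of the call/jmp instructions.
O2_INSNS = [
    "48 83 ec 08",           # sub   $0x8,%rsp
    "e8 00 00 00 00",        # callq e
    "48 85 c0",              # test  %rax,%rax
    "74 1a",                 # je    28
    "e8 00 00 00 00",        # callq f
    "48 85 c0",              # test  %rax,%rax
    "74 10",                 # je    28
    "48 83 c4 08",           # add   $0x8,%rsp
    "e9 00 00 00 00",        # jmpq  g
    "0f 1f 80 00 00 00 00",  # nopl  0x0(%rax)
    "48 83 c4 08",           # add   $0x8,%rsp
    "c3",                    # retq
]

OS_INSNS = [
    "55",                    # push  %rbp
    "53",                    # push  %rbx
    "51",                    # push  %rcx
    "e8 00 00 00 00",        # callq e
    "48 85 c0",              # test  %rax,%rax
    "48 89 c3",              # mov   %rax,%rbx
    "74 08",                 # je    18
    "e8 00 00 00 00",        # callq f
    "48 89 c5",              # mov   %rax,%rbp
    "48 85 ed",              # test  %rbp,%rbp
    "74 0d",                 # je    2a
    "48 85 db",              # test  %rbx,%rbx
    "74 08",                 # je    2a
    "5a",                    # pop   %rdx
    "5b",                    # pop   %rbx
    "5d",                    # pop   %rbp
    "e9 00 00 00 00",        # jmpq  g
    "58",                    # pop   %rax
    "5b",                    # pop   %rbx
    "5d",                    # pop   %rbp
    "c3",                    # retq
]


def code_size(insns):
    """Total number of encoded bytes in a list of instruction byte strings."""
    return sum(len(insn.split()) for insn in insns)


def count_one_byte(insns, lo, hi):
    """Count one-byte instructions whose opcode lies in [lo, hi]."""
    return sum(
        1
        for insn in insns
        if len(insn.split()) == 1 and lo <= int(insn, 16) <= hi
    )


if __name__ == "__main__":
    print("-O2 size:", code_size(O2_INSNS))                   # 45
    print("-Os size:", code_size(OS_INSNS))                   # 46
    print("pushes in -Os:", count_one_byte(OS_INSNS, 0x50, 0x57))  # 3
    print("pops in -Os:", count_one_byte(OS_INSNS, 0x58, 0x5F))    # 6
```

The totals agree with the last offsets in each listing (0x2c + 1 = 45 bytes for -O2, 0x2d + 1 = 46 bytes for -Os), and the -Os version carries nine bytes of push and pop instructions across its two exit paths.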