On Wed, Nov 11, 2009 at 10:42:31AM +0800, Wu Zhangjin wrote:

> > -mlong-calls really degrades performance. I have seen things like 6%
> > drop in network packet forwarding rates with -mlong-calls.
> >
>
> so much drop? seems only two instructions added for it: lui, addi. from
> this view point, I think the -fno-omit-frame-pointer(add, sd, move...)
> will also bring with much drop.

The calling sequence is quite badly bloated.  Example:

Normal 32/64-bit subroutine call:

        jal     symbol

32-bit with -mlong-calls:

        lui     $25, %hi(foo)
        addiu   $25, %lo(foo)
        jalr    $25

64-bit with -mlong-calls:

        lui     $25, %highest(foo)
        lui     $2, %hi(foo)
        daddiu  $25, %higher(foo)
        daddiu  $2, %lo(foo)
        dsll    $25, 32
        daddu   $25, $2
        jalr    $25

So, not considering the possible cost of the delay slot, that's 1 vs. 3 vs.
7 instructions.  Last I checked, ages ago, gcc didn't appropriately consider
this cost when generating -mlong-calls code, and Linux these days is also
optimized under the assumption that subroutine calls are cheap.

It's time that we get a -G optimization that works for the kernel; it would
allow cutting the -mlong-calls calling sequence down to just:

        lw/ld   $25, offset($gp)
        jalr    $25

> It's time to remove them? -mlong-calls, -fno-omit-frame-pointer.
>
> > It would be better to fix all the tools so that they could handle both
> > -mlong-calls and -mno-long-calls code.
> >
>
> It's totally possible, will try to make it work later. I just wanted the
> stuff simple, but if it really brings us with much drop, it's time to
> fix it.

  Ralf