Franck wrote:
2006/1/26, Nigel Stephens <nigel@xxxxxxxx>:
1) Using -march=4ksd reduces the cost of a multiply by 1 cycle
(from 5 to 4 cycles), so a few more constant multiplications, previously
expanded into a sequence of shifts, adds and subs, may now be replaced
by a shorter sequence of "li" and "mul" instructions.
Is it really specific to the 4ksd CPU? Could this behaviour be triggered
by other options?
Yes, when you use -Os the compiler uses the instruction cost (1) of a
mul, instead of the cycle cost (4), so it will be even more likely to
replace the expanded shift/add sequence by a mul.
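
As a rough illustration of the trade-off described above, here is a toy
cost model in Python. It is only a sketch, not GCC's actual heuristic:
the helper names (naf_digits, shift_add_ops, choose) are hypothetical,
every shift, add, sub and "li" is assumed to cost 1 in both cycles and
instructions, and the constant is assumed to fit in a single "li".

```python
def naf_digits(n):
    """Signed-digit (non-adjacent form) expansion of n > 0.

    Returns digits in {-1, 0, 1}, least significant first, so that
    sum(d << i) == n. Using -1 digits models the "subs" in the sequence.
    """
    digits = []
    while n > 0:
        if n & 1:
            d = 2 - (n & 3)  # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def shift_add_ops(n):
    """Assumed instruction count for multiplying by constant n without mul.

    Horner-style chain: each nonzero digit after the first costs one shift
    plus one add/sub; an even constant needs one final shift.
    """
    nonzero = sum(1 for d in naf_digits(n) if d)
    return 2 * (nonzero - 1) + (1 if n % 2 == 0 else 0)


def choose(n, mul_cycles, optimize_size):
    """Pick the cheaper expansion under the toy model.

    When optimizing for size the mul counts as 1 (instruction cost);
    otherwise it counts as mul_cycles (cycle cost). "li" always counts 1.
    """
    expanded = shift_add_ops(n)
    li_mul = 1 + (1 if optimize_size else mul_cycles)
    return "li+mul" if li_mul < expanded else "shift/add"


if __name__ == "__main__":
    for n in (10, 45):
        print(n, "4-cycle mul:", choose(n, 4, False),
              "| 5-cycle mul:", choose(n, 5, False),
              "| -Os:", choose(n, 4, True))
```

Under these assumptions a constant such as 45 sits right on the boundary:
it stays a shift/add sequence with a 5-cycle multiply but becomes "li" +
"mul" with a 4-cycle one, and counting instructions instead of cycles
moves even small constants such as 10 over to the mul form.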
   text    data     bss     dec     hex filename
2099642  110784   81956 2292382  22fa9e vmlinux-4ksd
2136269  110784   81956 2329009  2389b1 vmlinux-mips32r2
1953086  110784   81956 2145826  20be22 vmlinux-4ksd-Os
1954489  110784   81956 2147229  20c39d vmlinux-mips32r2-Os
I now have to check that your first and second points don't hurt the
overall speed too much, although I don't know how to measure that.
If they don't, I could safely use the -march=mips32r2 -Os options.
You could, but why not stick with -march=4ksd if that's your CPU of
choice? It appears to result in marginally smaller code even when using
-Os, and should have (slightly) better performance than a generic
mips32r2 kernel?
Nigel
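
For reference, the text-size differences implied by the size output
quoted above can be worked out with a few lines of Python. This is just
arithmetic on the figures already posted; the dictionary and the
text_delta helper are illustrative names, not part of any tool.

```python
# Text-segment sizes in bytes, copied from the size output in this thread.
text_size = {
    "vmlinux-4ksd": 2099642,
    "vmlinux-mips32r2": 2136269,
    "vmlinux-4ksd-Os": 1953086,
    "vmlinux-mips32r2-Os": 1954489,
}


def text_delta(smaller, larger):
    """Return (bytes saved, percent saved) of `smaller` relative to `larger`."""
    saved = text_size[larger] - text_size[smaller]
    return saved, 100.0 * saved / text_size[larger]


if __name__ == "__main__":
    pairs = [
        ("vmlinux-4ksd", "vmlinux-mips32r2"),
        ("vmlinux-4ksd-Os", "vmlinux-mips32r2-Os"),
        ("vmlinux-4ksd-Os", "vmlinux-4ksd"),
    ]
    for smaller, larger in pairs:
        saved, pct = text_delta(smaller, larger)
        print(f"{smaller} vs {larger}: {saved} bytes ({pct:.2f}%)")
```

The -march=4ksd advantage shrinks from 36627 bytes to 1403 bytes once
-Os is in effect, which matches the "marginally smaller" description.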