On 10/22/2014 01:34 AM, Ralf Baechle wrote:
This question comes up every once in a while, and I was also asked during ELCE in Düsseldorf why there is no single MIPS kernel for all platforms, so I thought I should post a writeup on the topic. The primary reason is that MIPS kernels are built as non-PIC code. This means the code is linked to a particular absolute address. The link address depends on the memory range available on a particular system; there is no one size that fits all systems, not even a large fraction of the supported systems.
There is another reason to have a relocatable kernel: The security people are starting to demand it so that they can randomize the load address.
What does it take to make kernels relocatable? A current kernel is not relocatable. One might do something along the lines of userland, where the dynamic linker is in a similar situation and needs to relocate itself before it can perform its actual job.

There are two approaches. The first keeps the non-PIC code. That requires keeping the entire relocation information. A lasat_defconfig vmlinux is 5733098 bytes, but when built with --emit-relocs to keep the reloc information in the final binary, the vmlinux file grows to 7217342 bytes! A quick look at the reloc sections:

Section Headers:
  [Nr] Name              Type Addr     Off    Size   ES Flg Lk Inf Al
  [ 2] .rel.text         REL  00000000 461538 0eedf8 08     34   1  4
  [ 4] .rel__ex_table    REL  00000000 550330 0040e0 08     34   3  4
  [ 8] .rel.rodata       REL  00000000 554410 0310e0 08     34   7  4
  [10] .rel.pci_fixup    REL  00000000 5854f0 000998 08     34   9  4
  [12] .rel__ksymtab     REL  00000000 585e88 00b3b0 08     34  11  4
  [14] .rel__ksymtab_gpl REL  00000000 591238 007180 08     34  13  4
  [17] .rel__param       REL  00000000 5983b8 000858 08     34  16  4
  [19] .rel__modver      REL  00000000 598c10 000038 08     34  18  4
  [21] .rel.data         REL  00000000 598c48 00a130 08     34  20  4
  [23] .rel.init.text    REL  00000000 5a2d78 00f008 08     34  22  4
  [25] .rel.init.data    REL  00000000 5b1d80 001d08 08     34  24  4
  [27] .rel.exit.text    REL  00000000 5b3a88 000b78 08     34  26  4

The approach could probably be optimized, but as a first-order approximation this demonstrates that there would be plenty of bloat in the binary. The positive side of this approach: no runtime penalty.
This is the approach I was thinking of taking. There would be a small PIC wrapper that applies the relocations and then passes control to the real entry point.
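To make the idea concrete, here is a rough sketch of the relocation pass such a wrapper might run. Nothing here is existing kernel code: apply_relocs() and struct reloc are hypothetical names, and the sketch assumes the load offset is a multiple of 64 KiB so that the R_MIPS_LO16 halves never need touching. Only the R_MIPS_* type numbers come from the MIPS ELF ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Relocation type numbers as defined by the MIPS ELF ABI. */
enum { R_MIPS_32 = 2, R_MIPS_26 = 4, R_MIPS_HI16 = 5, R_MIPS_LO16 = 6 };

/* Hypothetical compact record; a real Elf32_Rel packs the type into r_info. */
struct reloc {
	uint32_t offset;	/* byte offset of the word to patch in the image */
	uint32_t type;		/* one of the R_MIPS_* values above */
};

/*
 * Patch an image linked at old_base so that it runs at old_base + delta.
 * Assumption: delta is 64 KiB aligned, so the low 16 bits of every address
 * are unchanged and R_MIPS_LO16 becomes a no-op.
 * Returns 0 on success, -1 on an unsupported delta, type or jump range.
 */
int apply_relocs(uint32_t *image, uint32_t old_base, uint32_t delta,
		 const struct reloc *rel, size_t n)
{
	size_t i;

	if (delta & 0xffff)
		return -1;

	for (i = 0; i < n; i++) {
		uint32_t *loc = &image[rel[i].offset / 4];
		uint32_t insn = *loc;

		switch (rel[i].type) {
		case R_MIPS_32:		/* plain 32-bit pointer */
			*loc = insn + delta;
			break;
		case R_MIPS_HI16:	/* upper half, e.g. in a lui */
			*loc = (insn & 0xffff0000u) |
			       ((insn + (delta >> 16)) & 0xffffu);
			break;
		case R_MIPS_LO16:	/* unchanged for an aligned delta */
			break;
		case R_MIPS_26: {	/* j/jal: target within a 256 MiB region */
			uint32_t old_pc = old_base + rel[i].offset;
			uint32_t new_pc = old_pc + delta;
			uint32_t target = ((old_pc + 4) & 0xf0000000u) |
					  ((insn & 0x03ffffffu) << 2);

			target += delta;
			/* The jump cannot leave the region of its delay slot. */
			if ((target & 0xf0000000u) != ((new_pc + 4) & 0xf0000000u))
				return -1;
			*loc = (insn & 0xfc000000u) |
			       ((target >> 2) & 0x03ffffffu);
			break;
		}
		default:
			return -1;
		}
	}
	return 0;
}
```

The real wrapper would itself have to be position independent and would walk the .rel.* sections directly; the flat array here just keeps the sketch short.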
We would have to be careful with the ex_table, as it is now sorted at build time. For that, we could go to the scheme used by x86 and make the addresses in the ex_table relative; build-time sorting already works for x86 relocatable kernels.
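For illustration, a relative ex_table entry could look roughly like the following. This is a sketch modeled on the x86 idea of storing each address as a 32-bit offset from the field that holds it; the struct and helper names are made up for the example, and ex_set() only stands in for what the assembler and linker would emit.

```c
#include <stdint.h>

/* Each field holds an offset relative to its own address, not a pointer. */
struct rel_extable_entry {
	int32_t insn;	/* faulting instruction, relative to &entry->insn */
	int32_t fixup;	/* fixup code, relative to &entry->fixup */
};

/* Recover the absolute address of the faulting instruction. */
static inline uintptr_t ex_insn_addr(const struct rel_extable_entry *e)
{
	return (uintptr_t)((intptr_t)&e->insn + e->insn);
}

/* Recover the absolute address of the fixup code. */
static inline uintptr_t ex_fixup_addr(const struct rel_extable_entry *e)
{
	return (uintptr_t)((intptr_t)&e->fixup + e->fixup);
}

/* Hypothetical helper for this demo: encode two absolute addresses. */
static inline void ex_set(struct rel_extable_entry *e,
			  uintptr_t insn, uintptr_t fixup)
{
	e->insn = (int32_t)((intptr_t)insn - (intptr_t)&e->insn);
	e->fixup = (int32_t)((intptr_t)fixup - (intptr_t)&e->fixup);
}
```

Because the table moves together with the code it describes, the stored offsets stay valid at any load address, so the table needs no relocation entries, and the address order established by the build-time sort is preserved.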
David Daney.