On 10/22/2014 01:42 PM, Ralf Baechle wrote:
On Wed, Oct 22, 2014 at 08:19:07PM +0100, Maciej W. Rozycki wrote:
Another reason is that the protocol between the bootloader and the kernel
varies by platform. So you would have to have several different entry
points, one for each booting protocol.
I am not sure how the bootloaders would know which entry point to use.
That's where I foresaw the need for an ISA-style platform probe right
at the kernel entry point, before fanning out to a platform-specific
entry point.
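
To make the "probe, then fan out" idea concrete, here is a minimal sketch in C.
Everything in it is hypothetical: the `boot_args` structure, the two probe
functions and the register conventions they test are placeholders chosen for
illustration, not a description of any real bootloader's protocol or of
existing kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the argument registers at kernel entry. */
struct boot_args {
    uintptr_t a0, a1, a2, a3;
};

typedef void (*platform_entry_fn)(const struct boot_args *args);

/* One table row per supported boot protocol (names are illustrative). */
struct platform_probe {
    const char *name;
    int (*probe)(const struct boot_args *args); /* nonzero on match */
    platform_entry_fn entry;
};

/* Assumed convention: a0 == -2 and a1 holds a device tree pointer. */
static int probe_dt(const struct boot_args *args)
{
    return args->a0 == (uintptr_t)-2 && args->a1 != 0;
}

/* Assumed convention: a0 is a small argc and a1 is a non-null argv. */
static int probe_argv(const struct boot_args *args)
{
    return args->a0 >= 1 && args->a0 < 64 && args->a1 != 0;
}

/* Placeholder platform-specific entry points. */
static void entry_dt(const struct boot_args *args)   { (void)args; }
static void entry_argv(const struct boot_args *args) { (void)args; }

static const struct platform_probe probes[] = {
    { "dt-handoff",   probe_dt,   entry_dt   },
    { "argv-handoff", probe_argv, entry_argv },
};

/* Return the entry point of the first matching probe, or NULL if none match. */
platform_entry_fn select_entry(const struct boot_args *args)
{
    for (size_t i = 0; i < sizeof(probes) / sizeof(probes[0]); i++) {
        if (probes[i].probe(args))
            return probes[i].entry;
    }
    return NULL;
}
```

The order of the table matters in a scheme like this: the most specific
probes have to come first, because a loose heuristic such as the argc check
could otherwise claim a platform it does not belong to.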
Since we already support compressed kernels I'm wondering if relocation
might also be performed by the compression wrapper along with the
hardware probe. That would leave the vmlinux itself untouched and
the wrapper could be installed on the target.
Wouldn't it make sense to make a unified kernel virtually mapped? That
would avoid the issue with RAM being present at different locations across
systems, and if big pages were used, which I believe are available
almost universally across the MIPS family, any performance hit would be
minimal. There would be hardly any increase in the binary image size either.
Run-time mappings such as `kmalloc' or `ioremap' could continue using
unmapped segments.
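
For reference, the unmapped segments on a 32-bit MIPS kernel are plain
address arithmetic, which is why they need no TLB entries. The sketch below
assumes the classic 32-bit layout (KSEG0 at 0x80000000, KSEG1 at 0xa0000000,
each a 512MB window onto low physical memory); the helper names are made up
for this example and are not the kernel's own macros.

```c
#include <stdbool.h>
#include <stdint.h>

#define KSEG0_BASE 0x80000000u  /* unmapped, cached */
#define KSEG1_BASE 0xa0000000u  /* unmapped, uncached */
#define KSEG_SIZE  0x20000000u  /* 512MB per segment */

/* True if vaddr lies in KSEG0 or KSEG1, i.e. it bypasses the TLB. */
static inline bool is_unmapped(uint32_t vaddr)
{
    return vaddr >= KSEG0_BASE && vaddr < KSEG1_BASE + KSEG_SIZE;
}

/* Physical address behind a KSEG0/KSEG1 virtual address. */
static inline uint32_t kseg_to_phys(uint32_t vaddr)
{
    return vaddr & (KSEG_SIZE - 1);
}

/* Cached and uncached views of a physical address below 512MB. */
static inline uint32_t phys_to_kseg0(uint32_t paddr)
{
    return KSEG0_BASE | paddr;
}

static inline uint32_t phys_to_kseg1(uint32_t paddr)
{
    return KSEG1_BASE | paddr;
}
```

Because the translation is fixed, anything placed in these segments sits at
a virtual address dictated by where RAM physically is, which is exactly the
constraint a mapped kernel image would escape.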
I think some MIPS III CPUs were restricted to just 4MB max. page size.
NEC VR4xxx I think. Still a pair would map 8MB which on the affected
small memory systems should suffice. 16MB, 64MB are more typical sizes.
R3000 is a different kettle. To 4k or not to 4k is not a question ;-)
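
The "a pair would map 8MB" arithmetic follows from each R4000-style TLB
entry covering an even/odd pair of pages. A small sketch, assuming the
classic PageMask layout with a 4KB minimum page size (the function names are
invented for this example):

```c
#include <stdint.h>

/* Bytes covered by one TLB entry: an even/odd pair of pages. */
static inline uint64_t tlb_pair_span(uint64_t page_size)
{
    return 2 * page_size;
}

/*
 * PageMask value for a given page size, assuming the R4000-style
 * encoding where the mask starts at bit 13 and a 4KB page encodes as 0.
 */
static inline uint32_t tlb_pagemask(uint32_t page_size)
{
    return ((2 * page_size) - 1) & ~0x1fffu;
}

/* Number of wired TLB entries needed to cover an image of image_size bytes. */
static inline unsigned tlb_entries_needed(uint64_t image_size, uint64_t page_size)
{
    uint64_t span = tlb_pair_span(page_size);

    return (unsigned)((image_size + span - 1) / span);
}
```

With 4MB pages, `tlb_entries_needed(6u << 20, 4u << 20)` is 1, so a 6MB
image fits under a single wired entry, while a 12MB image would need two.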
Now mapping the kernel alone wouldn't solve the security issue mentioned
by David. The image would still lie around in KSEG0 / XKPHYS for whatever
wants to run over it, so that should ideally also be at a flexible address.
OTOH the mapped kernel certainly would have the lowest size overhead.
I have faint memories of restrictions on TLB instructions, or was it
TLB exception handlers, in mapped space; I would have to do some RTFMing
on that topic.
Years ago I did test the impact of one less available TLB entry with
lmbench; the loss was around 2%. That was on a CPU with 64 entries.
We have a private patch that does exactly this. The main motivation was
to place the kernel in the same 256MB virtual address region as the
modules, so that a direct calling sequence can be used in modules.
The resulting module code is much faster, so depending on the workload
it may be a performance win. We see things like IPv6 forwarding
improving by something like 6% when IPv6 is built as a module.
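
The 256MB figure comes from the `jal` encoding: the instruction carries a
26-bit word index, and the top four address bits are taken from the address
of the delay slot. A sketch of the reachability check, using 32-bit
addresses for brevity (the helper names and the example addresses are
illustrative only, not taken from the patch):

```c
#include <stdbool.h>
#include <stdint.h>

#define JAL_OPCODE      0x0c000000u  /* major opcode 000011 */
#define JAL_INDEX_MASK  0x03ffffffu  /* 26-bit instruction index */
#define JAL_REGION_MASK 0xf0000000u  /* bits inherited from the delay slot PC */

/*
 * A jal at address pc can reach target only if target is word aligned
 * and lies in the same 256MB region as the delay slot (pc + 4).
 */
static inline bool jal_reachable(uint32_t pc, uint32_t target)
{
    return (target & 3) == 0 &&
           ((pc + 4) & JAL_REGION_MASK) == (target & JAL_REGION_MASK);
}

/* Encode a jal to target; the caller must have checked jal_reachable(). */
static inline uint32_t jal_encode(uint32_t target)
{
    return JAL_OPCODE | ((target >> 2) & JAL_INDEX_MASK);
}
```

With the kernel in KSEG0 and a module in mapped space, for example a call
site at 0xc0001000 targeting 0x80100000, `jal_reachable()` is false, so the
module has to load the address into a register and use `jalr` instead.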
Also, we have many more TLB entries (128 or 256), so losing one is not a
big deal.
David Daney