On 08.07.2009, at 06:38, Benjamin Herrenschmidt wrote:
On Tue, 2009-07-07 at 16:17 +0200, Alexander Graf wrote:
This is the really low-level guest entry/exit code.

Usually the Linux kernel resides in virtual memory from 0xc000000000000000 to 0xffffffffffffffff. These addresses are mapped into every userspace application.

When entering a 32-bit guest, this is perfectly fine: that guest can't access memory that high anyway. When entering a 64-bit guest, however, the guest kernel probably lives in the same virtual memory region as the host, so we need to switch between the two.

During normal entry code we're still running at those virtual addresses, so we need a small wrapper in real memory that switches the high SLB state from host to guest and vice versa.
To store both host and guest state in the SLB, we keep guest kernel SLB entries in a different range (0x4000000000000000 to 0x7fffffffffffffff). For details on which entries go where, please see the patch itself.
Note that we have an unused VSID bit at the moment on 64-bit, AFAIK. We could probably use that to differentiate guest kernel VSIDs from host kernel VSIDs. That would avoid having to muck around with the EAs themselves that much, no?
Well, the problem is that we can't have two ESIDs for the same EA in
the SLB. So what I tried was to have guest ESIDs and host ESIDs
(PAGE_OFFSET+) live in the same SLB by removing the most significant
bit of the guest ESID.
I'm not sure I understand exactly what you are doing here; we should discuss this on IRC one of these days, I suppose. But you should be able to just get rid of the host kernel SLBs completely with some care, as there are some critical code paths where taking an exception without having the SLB entries for entry 0 and the kernel stack around will blow up...
Yeah, I've encountered quite a bunch of those :-).
But just blow them off, and when returning to the kernel, put back the ones that are needed (aka slb_flush_and_rebolt). You will need to play carefully with that, though; look at the code in slb.c, as the real pHyp hypervisor that may lie underneath will potentially muck around with the SLBs and occasionally restore them from the special in-memory shadows, so you probably want to switch the contents of those too.
Yikes. So pHyp restores SLB entries from a shadow? Sounds like I need
to mess with that one too :-(.
I'm not really fond of all the SLB switching code in general. The best case would probably be to have a host and a guest shadow SLB in the RMA that the real-mode code can use to switch the _full_ SLB. That way we'd also get rid of the CONTEXT_GUEST stuff in the kernel module, where we are in Linux but already have guest SLB entries active.
Of course none of that will work on legacy iSeries or POWER3, but I think we can safely say we don't care :-)
Any reason it doesn't work on POWER3? :-) It definitely does not work on iSeries, though the code could be made to work there, FWIW.
Alex