On 08.07.2009, at 06:38, Benjamin Herrenschmidt wrote:
On Tue, 2009-07-07 at 16:17 +0200, Alexander Graf wrote:
This is the really low-level guest entry/exit code.

Usually the Linux kernel resides in virtual memory from 0xc000000000000000 to 0xffffffffffffffff. These addresses are mapped into every userspace application.

When entering a 32-bit guest, this is perfectly fine: that guest can't access memory that high anyway. When entering a 64-bit guest, however, the guest kernel probably lives in the same virtual memory region as the host, so we need to switch between the two.

During normal entry code we're still running at those virtual addresses, so we need a small wrapper in real memory that switches the high SLB state from host to guest and vice versa.
To store both host and guest state in the SLB, we keep guest kernel SLB entries in a different range (0x4000000000000000 to 0x7fffffffffffffff). For details on which entries go where, please see the patch itself.
Note that we have an unused VSID bit at the moment on 64-bit, AFAIK. We could probably use that to differentiate guest kernel VSIDs from host kernel VSIDs. That would avoid having to muck around with the EAs themselves that much, no?
Well, the problem is that we can't have two ESIDs for the same EA in
the SLB. So what I tried was to have guest ESIDs and host ESIDs
(PAGE_OFFSET+) live in the same SLB by removing the most significant
bit of the guest ESID.
I'm not sure I understand exactly what you are doing here; we should discuss this on IRC one of these days, I suppose. But you should be able to just get rid of the host kernel SLBs completely with some care, as there are some critical code paths where taking an exception without having the SLB entries for entry 0 and the kernel stack around will blow up...
Yeah, I've encountered quite a bunch of those :-).
But just blow them off, and when returning to the kernel, put back the ones that are needed (aka slb_flush_and_rebolt). You will need to play carefully with that, though; look at the code in slb.c, as the real pHyp hypervisor that may lie underneath will potentially muck around with the SLBs and occasionally restore them from the special in-memory shadows, so you probably want to switch the contents of those too.
Yikes. So pHyp restores SLB entries from a shadow? Sounds like I need
to mess with that one too :-(.
I'm not really fond of all the SLB switching code in general. The best case would probably be to have a host and a guest shadow SLB in the RMA that the real-mode code can use to switch the _full_ SLB. That way we'd also get rid of the CONTEXT_GUEST stuff in the kernel module, where we are in Linux but already have guest SLB entries active.
Of course none of that will work on legacy iSeries or POWER3, but I think we can safely say we don't care :-)
Any reason it doesn't work on POWER3? :-) It definitely does not work on iSeries, though the code could be made to work there, FWIW.
Alex