Re: [PATCH 2/2] KVM: PPC: Book3S HV: Better handling of exceptions that happen in real mode

Paul Mackerras <paulus@xxxxxxxxx> · Sat, 5 Oct 2013 09:42:10 +1000

On Fri, Oct 04, 2013 at 02:56:31PM +0200, Alexander Graf wrote:
> 
> On 04.10.2013, at 14:33, Paul Mackerras wrote:
> 
> > On Fri, Oct 04, 2013 at 01:59:25PM +0200, Alexander Graf wrote:
> >> 
> >> On 04.10.2013, at 13:45, Paul Mackerras wrote:
> >> 
> >>> When an interrupt or exception happens in the guest that comes to the
> >>> host, the CPU goes to hypervisor real mode (MMU off) to handle the
> >>> exception but doesn't change the MMU context.  After saving a few
> >>> registers, we then clear the "in guest" flag.  If, for any reason,
> >>> we get an exception in the real-mode code, that then gets handled
> >>> by the normal kernel exception handlers, which turn the MMU on.  This
> >>> is disastrous if the MMU is still set to the guest context, since we
> >>> end up executing instructions from random places in the guest kernel
> >>> with hypervisor privilege.
> >>> 
> >>> In order to catch this situation, we define a new value for the "in guest"
> >>> flag, KVM_GUEST_MODE_HOST_HV, to indicate that we are in hypervisor real
> >>> mode with guest MMU context.  If the "in guest" flag is set to this value,
> >>> we branch off to an emergency handler.  For the moment, this just does
> >>> a branch to self to stop the CPU from doing anything further.
> >> 
> >> I don't understand how you get there. The only case I can imagine where you'd hit a normal Linux handler while in guest MMU context is a bug in the complex real mode handling code.
> > 
> > A bug is the usual case.  I think it is also possible (though very
> > unlikely) to get a machine check interrupt, since they can come at any
> > time.
> > 
> >> So basically what you're doing is you're changing the "guest mode" bit to HOST_HV while you're executing these.
> >> 
> >> The other change this patch does is it postpones the return to GUEST_MODE_NONE to after fast-path handling of interrupt exits.
> >> 
> >> What if you simply don't introduce a new mode but instead only postpone the GUEST_MODE_NONE switch to later? Worst case that can happen is that your bug spins the CPU into handling that exit in a tight loop - not much different from your explicit spin, no?
> > 
> > I did it like that so that we have a chance to save away the register
> > state for the point where the exception happened separately from the
> > guest state.  It can be very useful for debugging to have both sets.
> > The other thing of course is that if I did what you suggest and then
> > happened not to hit the exception on the second time through, we would
> > end up with corrupted guest state and no indication that it was
> > corrupted (since the register state for the bad exception would get
> > saved away in the vcpu struct).
> > 
> > I admit I haven't written the code to save away the register state
> > when one of these bad exceptions happens; that's partly because in the
> > lab we have ways of getting the register state directly from the CPU,
> > but I'm certainly intending to write that code soon.
> 
> Fair enough, but I think running that additional code when we only have a single register available, and then even stalling the CPU on a memory write to store away and load the state, doesn't really help performance.

That's what register renaming, branch prediction and speculative
execution are for. :)

> Either way, applied to ppc-next.

Thanks,
Paul.
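
For readers following the thread, the three-way check on the "in guest"
flag and the idea of keeping the bad-exception registers apart from the
guest registers can be modelled roughly as below.  This is only a
user-space C sketch under stated assumptions: the real check lives in
the real-mode assembly exception entry paths, and the struct layout, the
numeric flag values, the names bad_exc_regs and handle_exception(), and
the register-saving step itself (which the thread says is not written
yet) are all hypothetical.

```c
/* Flag states discussed in the thread.  Numeric values are illustrative
 * assumptions, not taken from the patch. */
enum guest_mode {
    GUEST_MODE_NONE    = 0,  /* host MMU context; normal handlers are safe */
    GUEST_MODE_GUEST   = 1,  /* the guest itself is executing */
    GUEST_MODE_HOST_HV = 3   /* hypervisor real mode, guest MMU context */
};

/* Where an incoming exception gets routed. */
enum route {
    ROUTE_HOST_HANDLER,  /* ordinary kernel handler, turns the MMU on */
    ROUTE_GUEST_EXIT,    /* normal guest exit path */
    ROUTE_EMERGENCY      /* bad exception taken in real-mode KVM code */
};

/* Hypothetical, heavily simplified register snapshot. */
struct regs {
    unsigned long gpr[32];
    unsigned long pc;
};

/* Hypothetical per-vcpu model with two separate save areas. */
struct vcpu_model {
    enum guest_mode in_guest;
    struct regs guest_regs;    /* guest state saved on a normal exit */
    struct regs bad_exc_regs;  /* state at the unexpected exception */
    int bad_exc_valid;         /* nonzero once bad_exc_regs is filled */
};

/* Route an exception based on the flag, saving state into the area
 * that matches the situation so guest state is never overwritten. */
static enum route handle_exception(struct vcpu_model *v,
                                   const struct regs *now)
{
    switch (v->in_guest) {
    case GUEST_MODE_GUEST:
        v->guest_regs = *now;              /* normal exit: save guest state */
        v->in_guest = GUEST_MODE_HOST_HV;  /* now in real-mode exit code */
        return ROUTE_GUEST_EXIT;
    case GUEST_MODE_HOST_HV:
        v->bad_exc_regs = *now;            /* keep guest_regs untouched */
        v->bad_exc_valid = 1;
        return ROUTE_EMERGENCY;            /* never enter normal handlers */
    default:
        return ROUTE_HOST_HANDLER;         /* host context: usual path */
    }
}
```

In this model a second exception arriving while the flag is
GUEST_MODE_HOST_HV lands in bad_exc_regs and leaves guest_regs intact,
which is the property argued for above: both register sets remain
available for debugging and the guest state is not silently corrupted.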