On 04.10.2013, at 14:33, Paul Mackerras wrote:

> On Fri, Oct 04, 2013 at 01:59:25PM +0200, Alexander Graf wrote:
>>
>> On 04.10.2013, at 13:45, Paul Mackerras wrote:
>>
>>> When an interrupt or exception happens in the guest that comes to the
>>> host, the CPU goes to hypervisor real mode (MMU off) to handle the
>>> exception but doesn't change the MMU context. After saving a few
>>> registers, we then clear the "in guest" flag. If, for any reason,
>>> we get an exception in the real-mode code, that then gets handled
>>> by the normal kernel exception handlers, which turn the MMU on. This
>>> is disastrous if the MMU is still set to the guest context, since we
>>> end up executing instructions from random places in the guest kernel
>>> with hypervisor privilege.
>>>
>>> In order to catch this situation, we define a new value for the "in guest"
>>> flag, KVM_GUEST_MODE_HOST_HV, to indicate that we are in hypervisor real
>>> mode with guest MMU context. If the "in guest" flag is set to this value,
>>> we branch off to an emergency handler. For the moment, this just does
>>> a branch to self to stop the CPU from doing anything further.
>>
>> I don't understand how you get there. The only case I can imagine where you'd hit a normal Linux handler while in guest MMU context is a bug in the complex real mode handling code.
>
> A bug is the usual case. I think it is also possible (though very
> unlikely) to get a machine check interrupt, since they can come at any
> time.
>
>> So basically what you're doing is you're changing the "guest mode" bit to HOST_HV while you're executing these.
>>
>> The other change this patch does is it postpones the return to GUEST_MODE_NONE to after fast-path handling of interrupt exits.
>>
>> What if you simply don't introduce a new mode but instead only postpone the GUEST_MODE_NONE switch to later? Worst case that can happen is that your bug spins the CPU into handling that exit in a tight loop - not much different from your explicit spin, no?
>
> I did it like that so that we have a chance to save away the register
> state for the point where the exception happened separately from the
> guest state. It can be very useful for debugging to have both sets.
> The other thing of course is that if I did what you suggest and then
> happened not to hit the exception on the second time through, we would
> end up with corrupted guest state and no indication that it was
> corrupted (since the register state for the bad exception would get
> saved away in the vcpu struct).
>
> I admit I haven't written the code to save away the register state
> when one of these bad exceptions happens; that's partly because in the
> lab we have ways of getting the register state directly from the CPU,
> but I'm certainly intending to write that code soon.

Fair enough, but I think adding that extra code when we only have a single register available, and then even stalling the CPU on a memory write to store away and load the state, doesn't really help performance.

Either way, applied to ppc-next.

Alex
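
For reference, the dispatch on the "in guest" flag discussed above could be modelled roughly as follows. This is only a sketch in plain C, not the patch itself (the real check sits in the exception entry assembly). The numeric flag values, the `handler` enum, and `pick_handler()` are hypothetical names made up for illustration; only KVM_GUEST_MODE_NONE and KVM_GUEST_MODE_HOST_HV are named in the thread, and KVM_GUEST_MODE_GUEST is assumed here as the name of the "running guest code" state.

```c
/* Illustrative values only; the real constants are defined by the kernel. */
enum in_guest_flag {
    KVM_GUEST_MODE_NONE    = 0, /* host context, host MMU context */
    KVM_GUEST_MODE_GUEST   = 1, /* assumed name: guest code is running */
    KVM_GUEST_MODE_HOST_HV = 2  /* hypervisor real mode, guest MMU context */
};

/* Hypothetical labels for where an exception ends up. */
enum handler {
    HANDLER_KERNEL,    /* normal kernel exception handler, turns the MMU on */
    HANDLER_KVM_EXIT,  /* guest exit path into the host */
    HANDLER_EMERGENCY  /* bad exception: stop the CPU (branch to self) */
};

/* Pick the handler for an exception based on the current "in guest" flag. */
enum handler pick_handler(enum in_guest_flag flag)
{
    switch (flag) {
    case KVM_GUEST_MODE_GUEST:
        return HANDLER_KVM_EXIT;
    case KVM_GUEST_MODE_HOST_HV:
        /* The MMU still holds the guest context, so the normal
         * handlers must not run; divert to the emergency handler. */
        return HANDLER_EMERGENCY;
    case KVM_GUEST_MODE_NONE:
    default:
        return HANDLER_KERNEL;
    }
}
```

The point of the third state is that an exception taken in the real-mode exit code can be told apart from both a guest exit and an ordinary host exception, which is what would let the register state of the bad exception be saved separately from the guest state.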