On Mon, Jul 18, 2011 at 03:46:56PM -0700, Frank Berreth wrote:
> Hello,
>
> I am working on save/restore code (independent of Qemu or KVM tools)
> and am having trouble doing 'the right thing' with KVM it seems. I do
> have trouble finding good documentation on specifically save or
> restore (besides the description of IOCTLs) and there is a high chance
> I am doing something goofy here.
>
> The way I am save/restoring is simply saving the state I get through
> the different KVM IOCTLs and dumping it back on restore. The exception
> is restoring of the MSRs, where I first get the list, then get the
> MSRS and save/restore only those that were filled in by KVM.
>
> The restore order is the following:
>
> first restore memory, then
> KVM_SET_REGS
> KVM_SET_FPU
> KVM_SET_SREGS
> KVM_SET_MSRS
> KVM_SET_MPSTATE
> KVM_SET_LAPIC
> some user mode devices and then
> KVM_SET_PIT
> KVM_SET_IRQCHIP (3 times, for MASTER, SLAVE and IOAPIC)

KVM_SET_VCPU_EVENTS is missing from this list.

Also note this comment from api.txt:

NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the corresponding
operations are complete (and guest state is consistent) only after
userspace has re-entered the kernel with KVM_RUN. The kernel side will
first finish incomplete operations and then check for pending signals.
Userspace can re-enter the guest with an unmasked signal pending to
complete pending operations.

> I am running different tests to test the save/restore code. One test
> for example will compile the linux kernel within the guest while the
> VM is saved, destroyed, re-create and finally restored & resumed
> during compilation (every 10 seconds) on the same physical machine.
>
> The behavior I get is different depending on KVM version and CPU.
>
> Running with kvm-84 gives only sporadic failures, maybe every thousand
> restores a double fault in the guest.
>
> Running with a 2.6.34 kernel gives an 'immediate' double fault in the
> guest. The guest does resume work as megabytes of guest data get
> modified before the double fault but eventually the guest crashes -
> and always it seems in the page fault handler. This is pretty much
> 100% reproduceable even with a 'workload' that simply does a "sleep
> 1d" in a shell script.

$ echo kvm_page_fault > /sys/kernel/debug/tracing/set_event

> On Intel CPUs though I get a VMX_INVALID_GUEST_STATE error. The guest
> does not run - which tells me I am doing something wrong here.

You can load kvm-intel.ko with emulate_invalid_guest_state=1, and
inspect guest_state_valid to pinpoint which state is invalid.

> Assuming that the KVM versions work fine - what state that comes from
> KVM through one of the KVM_GET_* IOCTLs has to be modified/sanitized
> on restore?

None. Only ordering must be maintained, follow QEMU there.
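
As a rough illustration of keeping the ordering in one place, here is a
minimal sketch in C. It only encodes the sequence from the list quoted
above, with KVM_SET_VCPU_EVENTS added, and dispatches each step through a
callback, so it can be dry-run without /dev/kvm. Everything named here
(restore_order, restore_all, find_step, count_step) is hypothetical and
not part of KVM or QEMU. The position of KVM_SET_VCPU_EVENTS right after
KVM_SET_LAPIC is an assumption; check it against QEMU before relying on
it. The ioctl is spelled KVM_SET_MP_STATE in linux/kvm.h, so that
spelling is used. The user mode devices restored between the LAPIC and
the PIT are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Which file descriptor a step is issued on. */
enum target { TARGET_VCPU, TARGET_VM };

struct restore_step {
    const char *name;    /* ioctl name as spelled in <linux/kvm.h> */
    enum target target;  /* vCPU fd or VM fd */
};

/*
 * Order taken from the list in this thread, plus KVM_SET_VCPU_EVENTS.
 * ASSUMPTION: placing KVM_SET_VCPU_EVENTS after KVM_SET_LAPIC is a guess
 * at what QEMU does; verify against the QEMU sources.
 */
static const struct restore_step restore_order[] = {
    { "KVM_SET_REGS",        TARGET_VCPU },
    { "KVM_SET_FPU",         TARGET_VCPU },
    { "KVM_SET_SREGS",       TARGET_VCPU },
    { "KVM_SET_MSRS",        TARGET_VCPU },
    { "KVM_SET_MP_STATE",    TARGET_VCPU },
    { "KVM_SET_LAPIC",       TARGET_VCPU },
    { "KVM_SET_VCPU_EVENTS", TARGET_VCPU },
    { "KVM_SET_PIT",         TARGET_VM },
    { "KVM_SET_IRQCHIP",     TARGET_VM },  /* issued once per chip */
};

#define RESTORE_STEPS (sizeof(restore_order) / sizeof(restore_order[0]))

/* Hypothetical callback: performs one step, returns 0 on success. */
typedef int (*apply_fn)(const struct restore_step *step, void *ctx);

/* Apply every step in table order; stop at the first failure. */
static int restore_all(apply_fn apply, void *ctx)
{
    assert(apply != NULL);
    for (size_t i = 0; i < RESTORE_STEPS; i++) {
        if (apply(&restore_order[i], ctx) != 0) {
            fprintf(stderr, "restore failed at %s\n", restore_order[i].name);
            return -1;
        }
    }
    return 0;
}

/* Position of a step in the table, or -1 if it is not listed. */
static int find_step(const char *name)
{
    for (size_t i = 0; i < RESTORE_STEPS; i++)
        if (strcmp(restore_order[i].name, name) == 0)
            return (int)i;
    return -1;
}

/* Dry-run callback: counts steps instead of calling ioctl(). */
static int count_step(const struct restore_step *step, void *ctx)
{
    (void)step;
    ++*(int *)ctx;
    return 0;
}
```

A real callback would look at step->target, pick the vCPU or VM fd, and
issue the matching ioctl() with the saved state; count_step only exists
so the order can be exercised without a VM.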