On 05/23/2011 04:40 PM, Joerg Roedel wrote:
> On Mon, May 23, 2011 at 04:08:00PM +0300, Avi Kivity wrote:
>> On 05/23/2011 04:02 PM, Joerg Roedel wrote:
>>> About live-migration with nesting, we had discussed the idea of just
>>> doing a VMEXIT(INTR) if the vcpu runs nested and we want to migrate.
>>> The problem was that the hypervisor may not expect an INTR intercept.
>>>
>>> How about doing an implicit VMEXIT in this case and an implicit VMRUN
>>> after the vcpu is migrated?
>>
>> What if there's something in EXIT_INT_INFO?
>
> On real SVM hardware EXIT_INT_INFO should only contain something for
> exception and NPT intercepts. These are all handled in the kernel and
> do not cause an exit to user-space, so no valid EXIT_INT_INFO should
> be around when we actually go back to user-space (which is when
> migration can happen). The exception might be the #PF/NPT intercept
> when the guest is doing very obscure things, like putting an
> exception/interrupt handler on MMIO memory, but that isn't really
> supported by KVM anyway, so I doubt we should care.
>
> Unless I miss something here, we should be safe by just not looking at
> EXIT_INT_INFO while migrating.
Agree.
>>> The nested hypervisor will not see the vmexit and the vcpu will be
>>> in a state where it is safe to migrate. This should work for
>>> nested-vmx too if the guest-state is written back to guest memory on
>>> VMEXIT. Is this the case?
>>
>> It is the case with the current implementation, and we can/should
>> make it so in future implementations, just before exit to userspace.
>> Or at least provide an ABI to sync memory.
>>
>> But I don't see why we shouldn't just migrate all the hidden state
>> (in-guest-mode flag, svm host paging mode, svm host interrupt state,
>> vmcb address/vmptr, etc.). It's more state, but no thinking is
>> involved, so it's clearly superior.
>
> An issue is that there is different state to migrate for Intel and AMD
> hosts. If we keep all that information in guest memory, the kvm kernel
> module can handle those details and all KVM needs to migrate is the
> in-guest-mode flag and the gpa of the currently executing vmcb/vmcs.
> This state should be enough for Intel and AMD nesting.
I think for Intel there is no hidden state apart from in-guest-mode (there is the VMPTR, but it is an actual register accessible via instructions). For SVM we can keep the hidden state in the host state-save area (including the vmcb pointer). The only risk is that SVM will gain hardware support for nesting and will choose a different format than ours.
An alternative is a fake MSR for storing this data, or just another get/set ioctl pair. We'll have a flags field that says which fields are filled in.
> The next benefit is that it works seamlessly even if the state that
> needs to be transferred is extended (e.g. by emulating a new
> virtualization hardware feature). This support can be implemented in
> the kernel module and no changes to qemu are required.
I agree it's a benefit. But I don't like making the fake vmexit part of live migration; if it turns out to be the wrong choice, it will be hard to undo.
--
error compiling committee.c: too many arguments to function