All,

One thing that oVirt would like to have (and that might be useful for other users) is a call that does some basic sanity checking for live migration. This call would go over to the remote libvirtd, run some checks, and return whether we think migration is likely to succeed. Note that I say "likely to succeed", because there are certainly things that can cause migration to fail after we've made the checks, but anything is better than what we have today ("try it and pray").

Now, in order for this call to be widely useful, I think we would have to allow the caller to specify *which* of the available checks they would like to perform, and then have some sort of return value that indicates whether there are show-stopper problems, or just problems that may make things sub-optimal on the remote side. The caller could then decide what action it wants to take.

There is also a corollary to the "is it sane for me to migrate" call: given two hosts A and B, what is the lowest common denominator I need to run my guest at so that migration between them is likely to succeed? This could also be used by management apps to make sure things are configured properly for the guest before ever starting it.

The biggest problem with implementing these calls, however, is that there is no comprehensive list of things we should check. This e-mail is an attempt to write down some of the more obvious things we need to check, and to garner discussion of things I might have missed. Once I have a proper list, I'll add it to the TODO page on the libvirt Wiki so it's at least somewhere permanent. Note that we don't have to implement *all* of these as a first go; if we leave the interface open enough, we can add more checks as we go along without breaking compatibility.

MIGRATION CRITERIA:

0) Matching hypervisors - seems obvious, but I'm not sure we have these checks today. Make sure we don't try to migrate Xen to KVM or vice-versa. We might also want to at least warn the caller when migrating from a "newer" hypervisor (say, Xen 3.2) to an "older" hypervisor (say, Xen 3.1). That should, in theory, work, but maybe the caller would prefer not to do it if possible. Rich has pointed out that KVM and Xen are accidentally incompatible in libvirt, but we should make the check explicit.

1) Matching CPU architectures - also obvious, but as far as I know there is no checking for this today (well, at least in Xen; I don't know about libvirt). So you can happily attempt to migrate from i386 -> ia64, and watch the fireworks. We also need to make sure you can't migrate x86_64 -> i386. I believe i386 -> x86_64 should work, but this might be hypervisor dependent.

2) Matching CPU vendors - this one isn't a hard requirement; given the things below, migration may still be likely to succeed even if we go from AMD to Intel or vice-versa. It still might be useful information for the caller to know.

3) CPU flags - the CPU flags of the destination *must* be a superset of the CPU flags that were presented to the guest at startup. Many OSes and applications check the CPU flags once at startup to choose optimized routines, and then never check again; if they happened to select sse3, and sse3 is not there on the destination, they will (eventually) crash. This is where the CPU masking technology and the lowest-common-denominator libvirt call can make a big difference. If you make sure to mask some of the CPU flags off of the guest when you are first creating it, then the destination host just needs a superset of the flags that were presented to the guest at bootup, which makes the problem easier.
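To make 1) through 3) concrete, here is a rough Python sketch (using the existing Python bindings) of the kind of check I have in mind. It only uses getCapabilities() on both ends; comparing host flags is a conservative stand-in for the flags actually presented to the guest, the destination URI and host name are made up, and the exact layout of <host><cpu> in the capabilities XML varies between versions, so take the parsing with a grain of salt:

    # Rough sketch only: compare architecture and host CPU feature flags between
    # a source and destination libvirtd using the capabilities XML.  The layout
    # of <host><cpu> differs between versions, so the parsing below is a guess.
    import xml.etree.ElementTree as ET
    import libvirt

    def host_arch_and_features(conn):
        cpu = ET.fromstring(conn.getCapabilities()).find('./host/cpu')
        arch = cpu.findtext('arch')
        feats = set()
        # Newer style: <feature name='sse3'/>; older style: bare elements
        # under a <features> child.
        for f in cpu.findall('feature'):
            feats.add(f.get('name'))
        for f in cpu.findall('./features/*'):
            feats.add(f.tag)
        return arch, feats

    src = libvirt.openReadOnly(None)
    dst = libvirt.openReadOnly('qemu+ssh://dest.example.com/system')  # hypothetical URI

    src_arch, src_feats = host_arch_and_features(src)
    dst_arch, dst_feats = host_arch_and_features(dst)

    if src_arch != dst_arch:
        print('FAIL: architecture mismatch: %s -> %s' % (src_arch, dst_arch))
    missing = src_feats - dst_feats
    if missing:
        print('FAIL: destination lacks CPU flags: %s' % ', '.join(sorted(missing)))

Obviously the real call would run inside the destination libvirtd and take flags saying which checks to perform; the point is just that most of the raw data is already exposed.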
4) Number of CPUs - generally, you want the destination to have at least one physical CPU for each virtual CPU assigned to the guest. However, I can see use cases where this might not hold (temporary or emergency migrations). So this would probably be a warning, and the caller can make the choice of whether to proceed.

5a) Memory - non-NUMA -> non-NUMA - fairly straightforward. The destination must have enough memory to fit the guest memory. We might want to do some "extra" checking on the destination to make sure we aren't going to OOM the destination as soon as we arrive. (A rough sketch of this kind of check follows 8) below.)

5b) Memory - non-NUMA -> NUMA - a little trickier. There are no cpusets we have to worry about, since we are coming from non-NUMA, but for absolute best performance we should try to fit the whole guest into a single NUMA node. Of course, if that node is overloaded, that may be a bad idea. It's the NUMA placement problem, basically.

5c) Memory - NUMA -> non-NUMA - less tricky. On the destination, all memory is "equally" far away, so there is no need to worry about cpusets. We just have to make sure there is enough memory on the destination for the guest.

5d) Memory - NUMA -> NUMA - tricky, just like case 5b). We need to determine whether there is enough memory in the machine first, then check whether we can fit the guest in a single node, and also check whether we can match the cpuset from the source on the destination.

6a) Networks - at the very least, the destination must have the same bridges configured as the source side. Whether those bridges are hooked to the same physical networks as the source is another question, and may be outside the bounds of what we can/should check.

6b) Networks - we need to make sure that the device model on the remote side supports the same devices as the source side. That is, if you have an e1000 NIC on the source side, but your destination doesn't support it, you are going to fail the migration.

7a) Disks - we have to make sure that all of the disks on the source side are available on the destination side, at the same paths. To be entirely clear, we have to make sure that the file on the destination side is the *same* file as on the source side, not just a file with the same name. For traditional file-based storage, path names may be the best we can do. For device names (like LVM, actual disk partitions, etc.), we might be able to take advantage of device enumeration APIs and validate that the device info is the same (UUID matching, etc.). (Again, a rough sketch follows 8) below.)

7b) Disks - additionally, we need to make sure that the device model on the remote side supports the same devices as the source side. That is, if you have a virtio drive on the source side, but your destination host doesn't support virtio on the backend, you are going to fail. (virtio might be a bad example, but there might be further things in the device model in the future that we might not necessarily have on both ends.)

------

That's the absolute basic criteria. More esoteric/less thought-out criteria follow:

8) Time skew - this is less thought out at the moment, but if you calibrated your lpj at boot time, and you now migrate to a host with a different clock frequency, your time will run either fast or slow compared to what you expect. Synchronized vs. unsynchronized TSC could also cause issues, etc.
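Going back to 4) and 5a)-5d), here is a strawman of how the existing API could already answer most of the questions. The guest name, the destination URI, and the 10% memory headroom are all made up; this is only meant to show where the numbers would come from:

    # Rough sketch for checks 4) and 5): does the destination have enough
    # physical CPUs and enough free memory for the guest?  Host names are
    # hypothetical; the 10% headroom is an arbitrary fudge factor.
    import libvirt

    src = libvirt.open(None)
    dst = libvirt.open('qemu+ssh://dest.example.com/system')

    dom = src.lookupByName('myguest')          # hypothetical guest name
    state, maxmem_kb, curmem_kb, nvcpus, cputime = dom.info()

    # 4) physical CPUs on the destination vs. virtual CPUs in the guest
    dst_pcpus = dst.getInfo()[2]               # getInfo() -> [model, memory, cpus, ...]
    if nvcpus > dst_pcpus:
        print('WARN: guest has %d vcpus but destination has only %d pcpus'
              % (nvcpus, dst_pcpus))

    # 5a) free memory on the destination vs. guest memory, with some headroom
    dst_free_kb = dst.getFreeMemory() / 1024   # getFreeMemory() returns bytes
    if maxmem_kb * 1.1 > dst_free_kb:
        print('FAIL: guest needs %d KB but destination has only %d KB free'
              % (maxmem_kb, dst_free_kb))

    # 5b)/5d) on a NUMA destination, also see whether the guest fits in one node
    nodes = dst.getInfo()[4]
    if nodes > 1:
        per_node = dst.getCellsFreeMemory(0, nodes)   # free bytes per NUMA cell
        if not any(cell / 1024 > maxmem_kb for cell in per_node):
            print('WARN: guest does not fit in any single NUMA node on the destination')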
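And for 7a), for storage that libvirt already knows about, we could at least check that a volume exists at the same path on the destination. This does not prove it is the *same* backing storage, of course, and it only works for paths that sit under a defined storage pool on the destination; guest name and URI are again made up:

    # Rough sketch for check 7a): for every disk source path in the guest XML,
    # see whether the destination libvirtd knows a storage volume at that path.
    # A first pass only -- a same-named file is not necessarily the same file.
    import xml.etree.ElementTree as ET
    import libvirt

    src = libvirt.open(None)
    dst = libvirt.open('qemu+ssh://dest.example.com/system')   # hypothetical URI
    dom = src.lookupByName('myguest')                          # hypothetical guest name

    tree = ET.fromstring(dom.XMLDesc(0))
    for source in tree.findall('./devices/disk/source'):
        path = source.get('file') or source.get('dev')
        if path is None:
            continue
        try:
            dst.storageVolLookupByPath(path)
        except libvirt.libvirtError:
            print('FAIL: destination has no storage volume at %s' % path)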
9) PCI passthrough - this is actually a check on the *source* side. If the guest is using a PCI passthrough device, it *usually* doesn't make sense to live migrate it. On the other hand, it looks like various groups are trying to make this work (with the bonding of a PV NIC to a PCI-passthrough NIC), so we need to keep that in mind, and not necessarily make this a hard failure.

10) MSRs?? - I've thought about this one before, but I'm not sure what the answer is. Unfortunately, MSRs in virtualization are handled sort of hodge-podge. That is, some MSRs are emulated for the guests, some MSRs guests have direct control over, and some aren't emulated at all. This one can get ugly fast; it's probably something we want to leave until later.

11) CPUID?? - not entirely sure about this one; there is a lot of model-specific information encoded in the various CPUID leaves (things like cache size, cache line size, etc.). However, I don't know whether CPUID instructions are trapped and emulated under the different hypervisors, or whether they are executed right on the processor itself. I guess if an OS or application called this at startup, cached the information, and then checked again later, it might get upset, but that seems somewhat unlikely.

Things that I've missed?

Thanks,
Chris Lalancette