On Mon, Aug 01, 2022 at 11:03:49AM -0500, Praveen K Paladugu wrote:
> Folks,
>
> We are implementing Live Migration support in the "ch" driver of
> Libvirt. I'd like to confirm if the approach we have chosen would be
> accepted upstream once implemented.
>
> Our immediate goal is to implement "Hypervisor Native" + "Managed
> Direct" mode of migration. "Hypervisor Native" here refers to the
> VMM (ch) being responsible for the data flow. This is in contrast to
> TUNNELLED migration, where data is sent over the libvirt RPC.

Avoiding TUNNELLED migration is a very good idea. It was a short term
hack to work around the lack of TLS support in QEMU. It is more
efficient to have TLS natively integrated in the hypervisor layer than
in libvirt. IOW, "Hypervisor Native" is a good choice.

> "Managed Direct" refers to the virsh client being responsible for the
> control flow between the source and destination hosts. The libvirtd
> daemons on the source and destination do not have to communicate with
> each other. These modes are described further at
> https://libvirt.org/migration.html#network-data-transports.

I'd caution that 'managed direct' migration leaves you with fewer
options for ensuring resilience of the migration. IOW, if the client
application goes away, it will be harder for the libvirt CH driver to
recover from that scenario. Also, if a client app is using the
DigitalOcean 'go-libvirt' API instead of our 'libvirt-go-module' API,
things are even more limited, since the 'go-libvirt' API speaks the
RPC protocol directly, bypassing the libvirt.so logic related to the
migration process steps.

With the peer-to-peer mode, migration can carry on even if the client
app goes away, since the client app isn't a part of the control loop.

So overall, I'd encourage peer-to-peer migration as the preferable
option, unless you can hand off absolutely everything to the CH code
and not have libvirt involved in orchestrating the migration steps at
all?

> At the moment, Cloud-Hypervisor supports receiving migration data
> only on Unix Domain Sockets. Also, Cloud-Hypervisor does not encrypt
> the VM data while sending.

Hmm, that's quite limiting.

> We are considering forking "socat" processes as documented at
> https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/docs/live_migration.md.
> The socat processes will be forked in the "Prepare" and "Perform"
> phases on the Destination and Source hosts respectively.
>
> I couldn't find any existing implementation in libvirt to connect
> Domain Sockets on different hosts. Please let me know if you'd
> recommend a different approach from forking socat processes to
> connect Domain Sockets on the source and dest hosts to allow Live VM
> Migration.

I think building something around socat will get you going quickly, but
ultimately be harmful over the long term. Our experience with QEMU has
been that to maximise performance you need the lowest level in full
control. These days QEMU can open multiple TCP connections concurrently
from multiple threads, so that throughput isn't limited by the data
copy performance of a single CPU. It also has the ability to take
advantage of kernel features like zerocopy. Use of a socat proxy is
going to add many data copies to the transport, which can only harm
your performance.
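
To illustrate, the relay pattern from the CH live migration doc looks
roughly like the sketch below (the ch-remote sub-commands are the ones
documented there; the socket paths, port, and host name are
illustrative assumptions), with every byte of guest state flowing
through two extra socat processes:

    # Destination host: tell CH to accept the VM state on a Unix
    # socket, then bridge inbound TCP into that socket with socat.
    ch-remote --api-socket=/tmp/ch-dst-api.sock \
        receive-migration unix:/tmp/migrate.sock &
    socat TCP-LISTEN:6000,reuseaddr UNIX-CONNECT:/tmp/migrate.sock

    # Source host: bridge a local Unix socket out to the destination's
    # TCP port, then tell CH to send the VM state into it.
    socat UNIX-LISTEN:/tmp/migrate.sock TCP:dst-host:6000 &
    ch-remote --api-socket=/tmp/ch-src-api.sock \
        send-migration unix:/tmp/migrate.sock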
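
For comparison, if CH natively supported TCP endpoints for migration
(a hypothetical 'tcp:' address below, since today only 'unix:' is
accepted), the relay and its extra data copies disappear entirely:

    # Hypothetical: destination CH accepts the migration stream on
    # TCP directly, with no socat in the data path.
    ch-remote --api-socket=/tmp/ch-dst-api.sock \
        receive-migration tcp:0.0.0.0:6000

    # Hypothetical: source CH connects straight to the destination.
    ch-remote --api-socket=/tmp/ch-src-api.sock \
        send-migration tcp:dst-host:6000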
So my recommendation would be to invest time in first extending CH so
that it natively supports opening TCP connections, and then take
advantage of that in libvirt from the start. You then have the right
basic foundation on which to add stuff like TLS, zerocopy,
multi-connection, and more.

With regards,
Daniel

-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|