On Thu, Apr 23, 2015 at 11:01:44AM -0400, Laine Stump wrote:
> On 04/23/2015 04:34 AM, Chen Fan wrote:
> >
> > On 04/20/2015 06:29 AM, Laine Stump wrote:
> >> On 04/17/2015 04:53 AM, Chen Fan wrote:
> >>> - on destination side, check whether need to hotplug new NIC
> >>>   according to specified XML.
> >>>   usually, we use migrate "--xml" command option to specify the
> >>>   destination host NIC mac
> >>>   address to hotplug a new NIC, because source side passthrough
> >>>   NIC mac address is different,
> >>>   then hotplug the device according to the destination XML
> >>>   configuration.
> >
> >> Why does the MAC address need to be different? Are you suggesting doing
> >> this with passed-through non-SRIOV NICs? An SRIOV virtual function gets
> >> its MAC address from the libvirt config, so it's very simple to use the
> >> same MAC address across the migration. Any network card that would be
> >> able to do this on any sort of useful scale will be SRIOV-capable (or
> >> should be replaced with one that is - some of them are not that
> >> expensive).
> >
> > Hi Laine,
> >
> > I think SRIOV virtual NIC to support migration is good idea,
> > but I also think some passthrough NIC without SRIOV-capable. for
> > these NIC devices we only able to use <hostdev> to specify the
> > passthrough function, so for these NIC I think we should support too.
>
> As I think you've already discovered, passing through non-SRIOV NICs is
> problematic. It is completely impossible for the host to change their
> MAC address before assigning them to the guest - the guest's driver sees
> standard netdev hardware and resets it, which resets the MAC address to
> the original value burned into the firmware. This makes management more
> complicated, especially when you get into scenarios such as what we're
> discussing (i.e. migration) where the actual hardware (and thus MAC
> address) may be different from one run to the next.

Right, passing through PFs is also insecure.
Let's get everything working fine with VFs first, worry about PFs later.

> Since libvirt's <interface> element requires a fixed MAC address in the
> XML, it's not possible to have an <interface> that gets the actual
> device from a network pool (without some serious hacking to that code),
> and there is no support for plain (non-network) <hostdev> device pools;
> there would need to be a separate (nonexistent) driver for that. Since
> the <hostdev> element relies on the PCI address of the device (in the
> <source> subelement, which also must be fixed) to determine which device
> to passthrough, a domain config with a <hostdev> that could be run on
> two different machines would require the device to reside at exactly the
> same PCI address on both machines, which is a very serious limitation to
> have in an environment large enough that migrating domains is a requirement.
>
> Also, non-SRIOV NICs are limited to a single device per physical port,
> meaning probably at most 4 devices per physical host PCIe slot, and this
> results in a greatly reduced density on the host (and even more so on
> the switch that connects to the host!) compared to even the old Intel
> 82576 cards, which have 14 VFs (7VFs x 2 ethernet ports). Think about it
> - with an 82576, you can get 14 guests into 1 PCIe slot and 2 switch
> ports, while the same number of guests with non-SRIOV would take 4 PCIe
> slots and 14(!) switch ports. The difference is even more striking when
> comparing to chips like the 82599 (64 VFs per port x 2), or a Mellanox
> (also 64?) or SolarFlare (128?) card. And don't forget that, because you
> don't have pools of devices to be automatically chosen from, that each
> guest domain that will be migrated requires a reserved NIC on *every*
> machine it will be migrated to (no other domain can be configured to use
> that NIC, in order to avoid conflicts).
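
As a rough sketch of the density arithmetic quoted above (not taken from
libvirt; the function name and parameters are hypothetical, and it assumes
one guest per VF or per non-SRIOV port, with 4 ports per non-SRIOV card as
in the quoted estimate):

```python
import math


def slots_and_switch_ports(guests, devices_per_port, ports_per_card):
    """Return (PCIe slots, switch ports) needed to give each guest one device.

    Hypothetical helper for illustration only. devices_per_port is the
    number of assignable devices behind one physical port: 7 VFs for an
    82576 port, 1 for a non-SRIOV port.
    """
    # Each physical port in use needs its own switch port.
    switch_ports = math.ceil(guests / devices_per_port)
    # Each card occupies one PCIe slot and provides ports_per_card ports.
    pcie_slots = math.ceil(switch_ports / ports_per_card)
    return pcie_slots, switch_ports


# 82576: 7 VFs per port, 2 ports per card.
sriov = slots_and_switch_ports(14, devices_per_port=7, ports_per_card=2)
# Non-SRIOV: 1 device per port, assumed 4 ports per card.
plain = slots_and_switch_ports(14, devices_per_port=1, ports_per_card=4)
print("82576:    ", sriov)
print("non-SRIOV:", plain)
```

With the quoted figures this gives 1 slot and 2 switch ports for the 82576,
versus 4 slots and 14 switch ports for non-SRIOV cards.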
>
> Of course you could complicate the software by adding a driver that
> manages pools of generic hostdevs, and coordinates MAC address changes
> with the guest (part of what you're suggesting), but all that extra
> complexity not only takes a lot of time and effort to develop, it also
> creates more code that needs to be maintained and tested for regressions
> at each release.
>
> The alternative is to just spend $130 per host for an 82576 or Intel
> I350 card (these are the cheapest SRIOV options I'm aware of). When
> compared to the total cost of any hardware installation large enough to
> support migration and have performance requirements high enough that NIC
> passthrough is needed, this is a trivial amount.
>
> I guess the bottom line of all this is that (in my opinion, of course
> :-) supporting useful migration of domains that used passed-through
> non-SRIOV NICs would be an interesting experiment, but I don't see much
> utility to it, other than "scratching an intellectual itch", and I'm
> concerned that it would create more long term maintenance cost than it
> was worth.

I'm not sure it has no utility but it's easy to agree that VFs are more
important, and focusing on this first is a good idea.