On 01/20/2012 10:50 PM, Laine Stump wrote:
To refresh everyone's memory, the origin of the problem I'm trying to solve here is that the VFs of an SRIOV-capable ethernet card are given new random MAC addresses each time the card is initialized. If those VFs are then passed-through to a guest using the existing <hostdev> config, the guest will see a new MAC address each time the host is restarted, and will thus believe that a new ethernet card has been installed. This can result in anything from a dialog claiming that the guest has connected to a new network (MS products) to a new network device name showing up (Linux - "hmm, eth0 was unplugged, but here's this new device. Let's call it 'eth1'!").

Several months ago I sent out some mail proposing a scheme for automatically allocating network devices from a pool to be assigned to guests via PCI passthrough:

https://www.redhat.com/archives/libvir-list/2011-August/msg00937.html

My idea was to have a new <network> forward mode combined with guest <interface> definitions that would end up auto-generating a transient <hostdev> entry in the guest's config (and setting the VF's mac address in the process).

Dan Berrange pointed out in that thread that we really do need to have a persistent <hostdev> entry for these devices in the domain xml, if for no other reason than to guarantee that the same guest-side PCI address is always used (thus preventing surprises in the guest, such as re-activation demands from Microsoft OSes). (There were other reasons, but that one was the real "hard stop" for me.)

I've come back to this problem, and have decided that, while having the actual host device auto-allocated at runtime would be nice, first implementing a less ambitious solution that uses a hand-picked device would not preclude adding the more complicated/useful functionality later.

So, here's a new simpler proposal.
Step 1
------

In the end, the solution will be that the VF's auto-generated random MAC address should be replaced with a fixed MAC address supplied by libvirt prior to assigning the VF to the guest. As a first step to satisfy this basic requirement, I'm figuring to just extend the <hostdev> xml in this way:

| <hostdev mode='subsystem' type='pci' managed='yes'>
|   <source>
|     <address bus='0x06' slot='0x02' function='0x0'/>
|   </source>
|   <mac address='11:22:33:44:55:66'/>
| </hostdev>
In view of the discussion on SCSI passthrough, it seems to me that this should be attached to an <interface> element:
  <devices>
    <interface type='hostdev'>
      <source>
        <address type='pci' bus='0x06' slot='0x02' function='0x0'/>
      </source>
      <mac address='00:16:3e:5d:c7:9e'/>
      <address type='pci' .../>
    </interface>
  </devices>
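To make the shape of the proposed element concrete, here is a minimal Python sketch that builds it with the standard library. The helper name `build_hostdev_interface` is made up for illustration, and the element layout only mirrors the proposal in this thread; it is not an existing libvirt API. The guest-side <address> is left out because the example above elides it.

```python
import re
import xml.etree.ElementTree as ET

# Six colon-separated hex octets, e.g. 00:16:3e:5d:c7:9e
MAC_RE = re.compile(r"[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}")


def build_hostdev_interface(bus, slot, function, mac):
    """Build the proposed <interface type='hostdev'> element.

    Hypothetical helper: the schema follows the proposal in this thread.
    """
    if MAC_RE.fullmatch(mac) is None:
        raise ValueError(f"not a MAC address: {mac!r}")

    iface = ET.Element("interface", {"type": "hostdev"})
    source = ET.SubElement(iface, "source")
    # Host-side PCI address of the VF that is being passed through.
    ET.SubElement(source, "address", {
        "type": "pci",
        "bus": f"0x{bus:02x}",
        "slot": f"0x{slot:02x}",
        "function": f"0x{function:x}",
    })
    # Fixed MAC that would be set on the VF before it is assigned.
    ET.SubElement(iface, "mac", {"address": mac})
    return iface


if __name__ == "__main__":
    element = build_hostdev_interface(0x06, 0x02, 0x0, "00:16:3e:5d:c7:9e")
    print(ET.tostring(element, encoding="unicode"))
```

Keeping the host-side address under <source> and the MAC as a sibling is what lets the same element later carry a <virtualport> without changing its layout.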
3) I've seen requests from 2 places to do host-side virtual port association (i.e. vepa / 802.1Qb[gh]). Would it be feasible to do that association with the device after setting MAC address and before assigning it to the guest? (and likewise for the inverse) Or would the act of PCI assignment screw that up somehow? (one of the messages in the earlier thread says something about the device initialization by the guest un-doing necessary setup) (if it would work, a <virtualport> could just be added along with <mac address>).
I know almost nothing about this, but it does sound like another hint that augmenting <interface> is a better plan.
Step 2
------

Once the basic functionality is in place, a further step would be one just to simplify the admin's job - we could do this by replacing this config:

| <source>
|   <address bus='x' slot='y' function='z'/>
| </source>

with:

| <source>
|   <address netdev='eth22'/>
| </source>
  <devices>
    <interface type='hostdev'>
      <source dev='eth22'/>
      <address type='pci' .../>
    </interface>
  </devices>
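Accepting a netdev name means something has to map it back to a PCI address before the device can be assigned. A rough Python sketch of that lookup follows, assuming a Linux host where /sys/class/net/<name>/device is a symlink to the PCI device directory. The function names are hypothetical, and this is only one way the lookup might be done, not how libvirt does it.

```python
import os
import re

# PCI address as it appears in sysfs: domain:bus:slot.function
PCI_RE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_pci_address(text):
    """Split a sysfs-style PCI address into <address>-style attributes."""
    match = PCI_RE.match(text)
    if match is None:
        raise ValueError(f"not a PCI address: {text!r}")
    domain, bus, slot, function = (int(g, 16) for g in match.groups())
    return {
        "domain": f"0x{domain:04x}",
        "bus": f"0x{bus:02x}",
        "slot": f"0x{slot:02x}",
        "function": f"0x{function:x}",
    }


def netdev_to_pci(netdev, sysfs_root="/sys/class/net"):
    """Resolve a network device name to its PCI address via sysfs.

    Assumes the 'device' entry is a symlink whose target directory is
    named after the PCI address (true for PCI NICs on Linux).
    """
    link = os.path.join(sysfs_root, netdev, "device")
    return parse_pci_address(os.path.basename(os.path.realpath(link)))


if __name__ == "__main__":
    print(parse_pci_address("0000:06:02.0"))
```

With a mapping like this, <source dev='eth22'/> could be treated as shorthand for the explicit bus/slot/function form from Step 1.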
To further simplify configuration, it would be very nice if the choice of network device could be done automatically. Since libvirt's networks already have the concept of a pool of devices (and also of portgroups which can be used to set <virtualport> parameters), it kind of makes sense to use that. In this case, a network would be defined something like this:

| <network>
|   <name>passthrough-net</name>
|   <forward dev='eth20' mode='hostdev'> <!-- or "hardware" or "device" -->
|     <interface dev='eth20'/>
|     <interface dev='eth21'/>
|     <interface dev='eth22'/>
|     ..
|   </forward>
| </network>

(it could also contain a virtualport definition and/or portgroups containing virtualport definitions. Obviously, we would have to prohibit <bandwidth> elements (and several other things) in the definitions.)

Then, in lieu of a pci address or network device name (as "netdev"), the <hostdev>'s <source> would have a reference to the network:

| <hostdev mode='subsystem' type='pci' managed='yes'>
|   <source>
|     <address network='passthrough-net'/>
|   </source>
|   <mac address='11:22:33:44:55:66'/>
| </hostdev>
  <devices>
    <interface type='hostdev'>
      <source network='passthrough-net'/>
      <mac address='11:22:33:44:55:66'/>
      <address type='pci' .../>
    </interface>
  </devices>
(or, again, maybe use the separate <network> element: "<network name='passthrough-net'/>")

At attach time, the pool of devices in passthrough-net would be searched for a free device, and if found, that device would have its MAC address changed and be assigned to the guest. In this case, the live XML would be updated with the pci address information, but when the guest was destroyed, the device would be handed back to the pool, and the pci address info once again removed from the config.
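The attach/destroy cycle described above can be sketched as a small allocator. This is only an illustration in Python under stated assumptions: the class `PassthroughPool` and its methods are invented for the example, the pool hands out the first free device, and the actual MAC change and PCI assignment are left as comments rather than implemented.

```python
class PassthroughPool:
    """Toy model of a network's pool of passthrough devices (hypothetical)."""

    def __init__(self, devices):
        self._free = list(devices)   # devices nobody is using
        self._in_use = {}            # guest name -> device name

    def attach(self, guest, mac):
        """Pick a free device for the guest and record it as in use."""
        if guest in self._in_use:
            raise RuntimeError(f"{guest} already holds {self._in_use[guest]}")
        if not self._free:
            raise RuntimeError("no free device in pool")
        dev = self._free.pop(0)
        # A real implementation would set the VF's MAC address here and
        # then assign the device to the guest; both steps are omitted.
        self._in_use[guest] = dev
        # This is the information the live XML would be updated with.
        return {"dev": dev, "mac": mac}

    def release(self, guest):
        """Hand the guest's device back to the pool when it is destroyed."""
        dev = self._in_use.pop(guest)
        self._free.append(dev)
        return dev


if __name__ == "__main__":
    pool = PassthroughPool(["eth20", "eth21", "eth22"])
    print(pool.attach("guest1", "00:16:3e:5d:c7:9e"))
    print(pool.release("guest1"))
```

The persistent config would keep only the network name and the MAC; the chosen device exists only in the live state between `attach` and `release`.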
This sounds really nice, especially together with the auto-add VF functionality that was committed recently.
Paolo