On Wed, Jan 22, 2020 at 02:59:36PM +0200, Dan Kenigsberg wrote: > On Tue, Jan 21, 2020 at 4:47 PM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote: > > > > On Tue, Jan 21, 2020 at 12:46:38PM +0200, Dan Kenigsberg wrote: > > > On Mon, Jan 20, 2020 at 8:33 PM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote: > > > > > > > > On Sun, Jan 19, 2020 at 10:24:19PM -0500, Laine Stump wrote: > > > > > Current virtio-net drivers that support the failover feature match up > > > > > the virtio backup device with its corresponding hostdev device by > > > > > looking for an interface with a matching MAC address. Since libvirt > > > > > will assign a different random MAC address to each interface that > > > > > isn't given an explicit MAC address, this means that the configuration > > > > > for any failover pairs must include explicit matching MAC addresses. > > > > > > > > > > To make life easier, we could have libvirt populate the XML config > > > > > with the same auto-generated MAC address for both interfaces when it > > > > > detects a failover pair that have no MAC addresses provided (a > > > > > failover pair can be detected by matching <alias name='blah'/> of the > > > > > virtio interface with <driver backupAlias='blah'/> of the hostdev > > > > > interface), but then we would be stuck with that behavior even if the > > > > > virtio guest driver later eliminated the requirement that mac > > > > > addresses match. > > > > > > > > > > Additionally, some management software uses the MAC address as the > > > > > primary index for its list of network devices, and can't properly deal > > > > > with two interfaces having the same MAC address (oVirt). Even > > > > > libvirt's own virsh utility uses MAC address (combined with interface > > > > > type) to uniquely identify interfaces for the virsh detach-interface > > > > > command (in this case, fortunately the runtime interface type is used, > > > > > so one of the interfaces will always be of type='hostdev' and the > > > > > other type='something-else", so it doesn't currently cause any problem). > > > > > > > > > > In order to remove the necessity of explicitly setting interface MAC > > > > > addresses, as well as permit the two interfaces of a failover pair to > > > > > each have a unique index while still fulfilling the current guest > > > > > driver requirement that the MAC addresses matchin the guest, this > > > > > patch adds a new attribute "useBackupMAC" that is set on the hostdev > > > > > interface of the pair. When useBackupMAC='yes', the setup for the > > > > > hostdev interface will find the virtio failover interface (using > > > > > backupAlias) and use that interface's MAC address to initialize the > > > > > MAC address of the hostdev interface; the MAC address in the hostdev > > > > > interface config remains unchanged, it just isnt used for device > > > > > initialization. > > > > > > > > > > I made this patch to followup on > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1693587#c12 (where I > > > > > suggested this attribute as a possible remedy to oVirt's requirement > > > > > that each network device have a unique MAC address). > > > > > > > > > > Truthfully, I'm not convinced that I want it though, as it seems > > > > > "a bit" hackish. In particular, I've thought for a long time that the > > > > > "hostdev manager" code in util/virhostdev.c should really live in the > > > > > node device driver and be called from the hypervisors via a public API > > > > > (so that there is one central place on the host that maintains the > > > > > list of in-use PCI devices and their status), but this patch adds an > > > > > obstacle to that goal by adding a virDomainDefPtr to more of the APIs in > > > > > that library - if this was turned into a public API, then entire > > > > > virDomainDef would need to be serialized and sent in the API call, > > > > > then parsed at the other end - yuck :-/. NB: there are already > > > > > functions in virhostdev.h that take a virDomainDefPtr, so maybe I'm > > > > > being too sensitive. > > > > > > > > > > On the upside, it solves a problem, and default bevahior is unchanged. > > > > > > > > I don't believe it does solve a real problem. > > > > > > > > If a mgmt app is capable of setting useBackupMAC=yes when writing the > > > > XML doc, then I don't see why it cannot just as easily set the matching > > > > MAC address when wring the XML doc. > > > > > > > > It can still treat both NICs as having a different MAC address in its > > > > own internal code. All it has to do is use the matching MAC address > > > > when writing out the XML config it gives to libvirt. > > > > > > > > I know oVirt has a facility for hook scripts that are add-ons which > > > > can arbitrarily munge the XML that VDSM creates. So AFAICT this doesn't > > > > even need to involve changes to VDSM itself. There can be a hook > > > > script which looks for the config indicating a failover pair, and > > > > rewrites the XML to have the matching MAC addr. > > > > > > > > Such a workaround then only needs to exist for as long as the mgmt > > > > app has this problematic limitation without impacting libvirt's > > > > maint. > > > > > > > > So I don't want to take this application specific hack in libvirt. > > > > > > I see why you can see it as a hack that should not exist in libvirt. > > > However I would try to put out my case for it. This feature creates > > > something very similar to an in-guest bond between an sriov interface > > > to a virtio one. Currently, we ask end users to configure the bond, > > > set the sriov interface as the primary interface, and allow > > > mac-spoofing on the virio interface. To me, the purpose of this > > > feature is to remove the need for end-user intervention. The bond > > > device no longer need to be created in the guest, it can be configured > > > by management on the host side. Defining bonds in Linux is an > > > established procedure. You select pre-existing interfaces, each with > > > its different mac address, pick up a bonding mode, pick up a master > > > interface, pick up the active mac address of the bond and start using > > > it. I'd like to have the same experience when I configure this new > > > type of bonding via libvirt. It just feels right to have a couple of > > > independent interfaces, bonded together, with one selected as > > > "master". > > > > This is comparing apples & oranges IMHO. It is comparing what is done > > as an end user in the guest with what is done as an internal config > > option in libvirt. If you want to compare the user experiance, then > > the comparison is between the guest setup and the RHEV user interface > > experiance. AFAICT, we can achieve the desired user experiance in RHEV > > and there's no functional reason to need this patch. It is just adding > > a policy control knob that doesn't have any impact on what it is possible > > to configure for the guest, while adding to maint burden of libvirt. > > IMHO there are three layers of apples here. > In oVirt or KubeVirt, maybe also in OpenStack, a VM owner would like > to define to virtual interfaces. One that is primary, based on SR-IOV. > The other is secondary, connected via virtio to a logical network to > their choosing. They would to specify the second interface as a backup > for the first one. They don't really care about the mac address of the > interfaces. > In the guest, oVirt users can currently do just this using normal Linux bonding. > IMHO a user of virsh or virt-manager would like to do that, too. He or > she should not bother setting the mac address of the two interfaces; > no human like to do that. They rather state which interface is the > backup of another one. > > I think that what you see as a maintenance burden stems from a design > mistake^Wdecision in virtio, which expresses the "x is the backup of > y" as "x and y have the same mac address and x is virtio". I > (egotistically?) think of libvirt as the human-accessible API for > virtualization, that let me express what I want, with little leakage > of implementation issues. I think this is where the disconnect is. The libvirt API is not aiming to provide human targetted convenience. It is intending to provide a machine targetted functional mechanism with clear semantics, on which applications can construct human targetted virtualization solutions. Human convenience entails making a large number of usage policy decisions based on criteria that are appropriate for the application's use cases & knowledge of the guest OS needs. > I'll try to express the user experience I envisage in KubeVirt terms: > > kind: VM > spec: > domain: > devices: > interfaces: > - name: fast > sriov: {} > - name: backup > bridge: {} > backupOf: fast # <---- how I'd like to express the > relation between the two nics > networks: > - name: fast > multus: > networkName: sriov-vlan-100 > - name: backup > multus: > networkName: ovs-vlan-100 Yes, this is a perfectly sane way to expose teaming in KubeVirt. What you're showing here is the KubeVirt config though, and this will be converted into a libvirt guest XML config, at which time KubeVirt can choose to provide a matching MAC address for the two NICs it has requested. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|