Re: [libvirt PATCH 12/12] conf/qemu: new <driver> attribute "useBackupMAC"

Daniel P. Berrangé <berrange@xxxxxxxxxx> · Wed, 5 Feb 2020 11:47:58 +0000

On Wed, Jan 22, 2020 at 02:59:36PM +0200, Dan Kenigsberg wrote:
> On Tue, Jan 21, 2020 at 4:47 PM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
> >
> > On Tue, Jan 21, 2020 at 12:46:38PM +0200, Dan Kenigsberg wrote:
> > > On Mon, Jan 20, 2020 at 8:33 PM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
> > > >
> > > > On Sun, Jan 19, 2020 at 10:24:19PM -0500, Laine Stump wrote:
> > > > > Current virtio-net drivers that support the failover feature match up
> > > > > the virtio backup device with its corresponding hostdev device by
> > > > > looking for an interface with a matching MAC address. Since libvirt
> > > > > will assign a different random MAC address to each interface that
> > > > > isn't given an explicit MAC address, this means that the configuration
> > > > > for any failover pairs must include explicit matching MAC addresses.
> > > > >
> > > > > To make life easier, we could have libvirt populate the XML config
> > > > > with the same auto-generated MAC address for both interfaces when it
> > > > > detects a failover pair that have no MAC addresses provided (a
> > > > > failover pair can be detected by matching <alias name='blah'/> of the
> > > > > virtio interface with <driver backupAlias='blah'/> of the hostdev
> > > > > interface), but then we would be stuck with that behavior even if the
> > > > > virtio guest driver later eliminated the requirement that mac
> > > > > addresses match.
> > > > >
> > > > > Additionally, some management software uses the MAC address as the
> > > > > primary index for its list of network devices, and can't properly deal
> > > > > with two interfaces having the same MAC address (oVirt). Even
> > > > > libvirt's own virsh utility uses MAC address (combined with interface
> > > > > type) to uniquely identify interfaces for the virsh detach-interface
> > > > > command (in this case, fortunately the runtime interface type is used,
> > > > > so one of the interfaces will always be of type='hostdev' and the
> > > > > other type='something-else", so it doesn't currently cause any problem).
> > > > >
> > > > > In order to remove the necessity of explicitly setting interface MAC
> > > > > addresses, as well as permit the two interfaces of a failover pair to
> > > > > each have a unique index while still fulfilling the current guest
> > > > > driver requirement that the MAC addresses matchin the guest, this
> > > > > patch adds a new attribute "useBackupMAC" that is set on the hostdev
> > > > > interface of the pair. When useBackupMAC='yes', the setup for the
> > > > > hostdev interface will find the virtio failover interface (using
> > > > > backupAlias) and use that interface's MAC address to initialize the
> > > > > MAC address of the hostdev interface; the MAC address in the hostdev
> > > > > interface config remains unchanged, it just isnt used for device
> > > > > initialization.
> > > > >
> > > > > I made this patch to followup on
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1693587#c12 (where I
> > > > > suggested this attribute as a possible remedy to oVirt's requirement
> > > > > that each network device have a unique MAC address).
> > > > >
> > > > > Truthfully, I'm not convinced that I want it though, as it seems
> > > > > "a bit" hackish. In particular, I've thought for a long time that the
> > > > > "hostdev manager" code in util/virhostdev.c should really live in the
> > > > > node device driver and be called from the hypervisors via a public API
> > > > > (so that there is one central place on the host that maintains the
> > > > > list of in-use PCI devices and their status), but this patch adds an
> > > > > obstacle to that goal by adding a virDomainDefPtr to more of the APIs in
> > > > > that library - if this was turned into a public API, then entire
> > > > > virDomainDef would need to be serialized and sent in the API call,
> > > > > then parsed at the other end - yuck :-/.  NB: there are already
> > > > > functions in virhostdev.h that take a virDomainDefPtr, so maybe I'm
> > > > > being too sensitive.
> > > > >
> > > > > On the upside, it solves a problem, and default bevahior is unchanged.
> > > >
> > > > I don't believe it does solve a real problem.
> > > >
> > > > If a mgmt app is capable of setting useBackupMAC=yes when writing the
> > > > XML doc, then I don't see why it cannot just as easily set the matching
> > > > MAC address when wring the XML doc.
> > > >
> > > > It can still treat both NICs as having a different MAC address in its
> > > > own internal code. All it has to do is use the matching MAC address
> > > > when writing out the XML config it gives to libvirt.
> > > >
> > > > I know oVirt has a facility for hook scripts that are add-ons which
> > > > can arbitrarily munge the XML that VDSM creates. So AFAICT this doesn't
> > > > even need to involve changes to VDSM itself. There can be a hook
> > > > script which looks for the config indicating a failover pair, and
> > > > rewrites the XML to have the matching MAC addr.
> > > >
> > > > Such a workaround then only needs to exist for as long as the mgmt
> > > > app has this problematic limitation without impacting libvirt's
> > > > maint.
> > > >
> > > > So I don't want to take this application specific hack in libvirt.
> > >
> > > I see why you can see it as a hack that should not exist in libvirt.
> > > However I would try to put out my case for it. This feature creates
> > > something very similar to an in-guest bond between an sriov interface
> > > to a virtio one. Currently, we ask end users to configure the bond,
> > > set the sriov interface as the primary interface, and allow
> > > mac-spoofing on the virio interface. To me, the purpose of this
> > > feature is to remove the need for end-user intervention. The bond
> > > device no longer need to be created in the guest, it can be configured
> > > by management on the host side. Defining bonds in Linux is an
> > > established procedure. You select pre-existing interfaces, each with
> > > its different mac address, pick up a bonding mode, pick up a master
> > > interface, pick up the active mac address of the bond and start using
> > > it. I'd like to have the same experience when I configure this new
> > > type of bonding via libvirt. It just feels right to have a couple of
> > > independent interfaces, bonded together, with one selected as
> > > "master".
> >
> > This is comparing apples & oranges IMHO. It is comparing what is done
> > as an end user in the guest with what is done as an internal config
> > option in libvirt. If you want to compare the user experiance, then
> > the comparison is between the guest setup and the RHEV user interface
> > experiance. AFAICT, we can achieve the desired user experiance in RHEV
> > and there's no functional reason to need this patch. It is just adding
> > a policy control knob that doesn't have any impact on what it is possible
> > to configure for the guest, while adding to maint burden of libvirt.
> 
> IMHO there are three layers of apples here.
> In oVirt or KubeVirt, maybe also in OpenStack, a VM owner would like
> to define to virtual interfaces. One that is primary, based on SR-IOV.
> The other is secondary, connected via virtio to a logical network to
> their choosing. They would to specify the second interface as a backup
> for the first one. They don't really care about the mac address of the
> interfaces.
> In the guest, oVirt users can currently do just this using normal Linux bonding.
> IMHO a user of virsh or virt-manager would like to do that, too. He or
> she should not bother setting the mac address of the two interfaces;
> no human like to do that. They rather state which interface is the
> backup of another one.
> 
> I think that what you see as a maintenance burden stems from a design
> mistake^Wdecision in virtio, which expresses the "x is the backup of
> y" as "x and y have the same mac address and x is virtio". I
> (egotistically?) think of libvirt as the human-accessible API for
> virtualization, that let me express what I want, with little leakage
> of implementation issues.

I think this is where the disconnect is. The libvirt API is not aiming
to provide human targetted convenience. It is intending to provide a
machine targetted functional mechanism with clear semantics, on which 
applications can construct human targetted virtualization solutions. 
Human convenience entails making a large number of usage policy 
decisions based on criteria that are appropriate for the application's
use cases & knowledge of the guest OS needs. 

> I'll try to express the user experience I envisage in KubeVirt terms:
> 
> kind: VM
> spec:
>   domain:
>     devices:
>       interfaces:
>         - name: fast
>           sriov: {}
>         - name: backup
>           bridge: {}
>           backupOf: fast   # <---- how I'd like to express the
> relation between the two nics
>   networks:
>   - name: fast
>     multus:
>       networkName: sriov-vlan-100
>   - name: backup
>     multus:
>       networkName: ovs-vlan-100

Yes, this is a perfectly sane way to expose teaming in KubeVirt.

What you're showing here is the KubeVirt config though, and this
will be converted into a libvirt guest XML config, at which time
KubeVirt can choose to provide a matching MAC address for the two 
NICs it has requested.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|