Re: <interface type='hostdev'>vf configuration cleanup when VM is delete


 



On 12/16/2015 07:56 AM, Moshe Levi wrote:

 

To clean up the VF I use

    ip link set dev p4p2 vf 0 mac 0

and it works.


Now *that* is interesting...

 

24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000

    link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff

    vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 2 MAC 00:00:00:00:00:b1, vlan 190, spoof checking off, link-state enable

    vf 3 MAC aa:bb:cc:00:00:12, vlan 190, spoof checking off, link-state enable

[root@r-ufm160 devstack]# ip link set dev enp3s0f0 vf 3 mac 0

[root@r-ufm160 devstack]# ip link show enp3s0f0

24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000

    link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff

    vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 2 MAC 00:00:00:00:00:b1, vlan 190, spoof checking off, link-state enable

    vf 3 MAC 00:00:00:00:00:b1, vlan 190, spoof checking off, link-state enable

 

It just set the address to 00:00:00:00:00:b1, and I don't know why. As I remember, Intel cards behave the same way, so I think it is related to iproute.


I just tried this with the igb driver on both 2.6.32 and 4.1 kernels, and a plain "0" is successful for me too. But, as you've experienced, it doesn't actually set the MAC address to 00:00:00:00:00:00, but instead puts random numbers in the final two bytes :-/

So I investigated further, and found that if I use:

  ip link set dev p4p2 vf 0 mac 00:00:00:00:00 <-- note 5 bytes, not 6

then all bytes except the *final* byte are 0, and the final byte is two seemingly random hex digits. But if I re-run the same command many times I find that it just rotates between 10 or so different values; not so random. (When I give "0" or "00:00:00:00" to ip link set, the 2nd to last byte is always the *exact same* value.)

So I looked in the source for the ip utility (in the iproute package) and I found that the function parsing mac addresses from the commandline just creates the buffer on the stack, doesn't initialize it, then parses in as many digits as you specify, leaving the rest with whatever happened to be sitting on the stack at the time :-O.

In other words, it's just a happy coincidence of a bug in iproute's mac address parser that "ip link set .... mac 0" happens to be successful (and that bytes 2-4 are 0 and 5-6 are non-0).
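The effect described above can be modelled in a few lines. This is only a sketch of the behaviour, not iproute's actual code: `parse_mac_uninitialized`, `parse_mac_strict`, and the `stack_garbage` argument are hypothetical names, and the leftover bytes are made up to match the output seen earlier in the thread.

```python
def parse_mac_uninitialized(text, stack_garbage):
    """Model a parser that fills a 6-byte buffer without clearing it first.

    `stack_garbage` stands in for whatever bytes already sat in the buffer.
    """
    buf = bytearray(stack_garbage)        # buffer is NOT zeroed before parsing
    for i, part in enumerate(text.split(":")):
        buf[i] = int(part, 16)            # only the supplied bytes are overwritten
    return ":".join(f"{b:02x}" for b in buf)


def parse_mac_strict(text):
    """Safer variant: insist on exactly six hex bytes instead of guessing."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 bytes, got {len(parts)}: {text!r}")
    octets = [int(p, 16) for p in parts]
    if any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"byte out of range in {text!r}")
    return ":".join(f"{o:02x}" for o in octets)


garbage = bytes([0x00, 0x00, 0x00, 0x00, 0x00, 0xB1])  # made-up leftover bytes
print(parse_mac_uninitialized("0", garbage))               # 00:00:00:00:00:b1
print(parse_mac_uninitialized("00:00:00:00:00", garbage))  # 00:00:00:00:00:b1
```

With the strict variant, "mac 0" would be rejected outright, which is exactly the compatibility problem: scripts that rely on the short form would stop working.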

I really don't know where to start / what to do with this information. There is obviously a bug in iproute that should be fixed, but if it is fixed before all the places in the kernel are adjusted to allow an all-0 MAC, then users will be complaining that their script which was working for years and years (although probably not doing exactly what they believed) is suddenly broken. And who knows what Hell-fury will be unleashed by some unknown bit of code in the kernel if a 0 mac address suddenly shows up for the first time ever. Sigh.

(BTW, Cisco's enic driver, on the other hand, doesn't support setting VF MAC addresses via a netlink message to the PF *at all* (so libvirt has to make special accommodations), but it happily accepts requests to directly set the MAC address to 00:00:00:00:00:00 via ioctl(SIOCSIFHWADDR) (and the interface MAC address really does get set to all 0's). There is a script for ovirt that uses a MAC address of all 0's to recognize that an interface is unused, and can thus be included in a pool of interfaces in a libvirt network. That won't work with any other SRIOV drivers though, because even if they initialize their VF macs to 0 (e.g. mlx and *new* (3.10+) igb (but *not* 2.6.32 igb!)), they can't be set back to 0 when they are once again unused. Again sigh.)

 

 

I used Fedora 21 with kernel 4.1.13-100.

 

The most annoying part is that in OpenStack, if I use an SR-IOV VF (interface hostdev) for a VM and then delete the VM, I can't reuse the VF for macvtap (interface direct), so I have to clean the MAC by running

    ip link set dev p4p2 vf 0 mac 0

 

 

I guess I will need to work around it in OpenStack.
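One possible shape for such a workaround, as a rough sketch only: parse the `ip link show` output for the PF and build the cleanup command for every VF that is not in use but still carries a MAC. The helper names `vf_macs` and `cleanup_commands` and the `in_use` argument are hypothetical, and the sketch only builds the commands rather than running them.

```python
import re

# Matches lines such as "    vf 3 MAC fa:16:3e:11:af:fe, vlan 190, ..."
VF_LINE = re.compile(r"^\s*vf (\d+) MAC ([0-9a-fA-F:]{17})")


def vf_macs(ip_link_output):
    """Return {vf_index: mac} parsed from `ip link show <pf>` text."""
    macs = {}
    for line in ip_link_output.splitlines():
        m = VF_LINE.match(line)
        if m:
            macs[int(m.group(1))] = m.group(2).lower()
    return macs


def cleanup_commands(pf, ip_link_output, in_use=()):
    """Build `ip link set ... mac 0` for idle VFs that still carry a MAC."""
    cmds = []
    for vf, mac in sorted(vf_macs(ip_link_output).items()):
        if vf not in in_use and mac != "00:00:00:00:00:00":
            cmds.append(["ip", "link", "set", "dev", pf, "vf", str(vf), "mac", "0"])
    return cmds


SAMPLE = """\
24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
    vf 3 MAC fa:16:3e:11:af:fe, vlan 190, spoof checking off, link-state enable
"""
for cmd in cleanup_commands("enp3s0f0", SAMPLE):
    print(" ".join(cmd))  # ip link set dev enp3s0f0 vf 3 mac 0
```

Because "mac 0" does not really produce an all-zero address, a VF cleaned this way would be flagged again on the next pass, so a real workaround would need its own record of which VFs are free.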

 

 

From: sendmail [mailto:justsendmailnothingelse@xxxxxxxxx] On Behalf Of Laine Stump
Sent: Tuesday, December 15, 2015 9:45 PM
To: Libvirt <libvir-list@xxxxxxxxxx>
Cc: Moshe Levi <moshele@xxxxxxxxxxxx>; vyasevic@xxxxxxxxxx
Subject: Re: <interface type='hostdev'>vf configuration cleanup when VM is delete

 

On 12/15/2015 01:34 PM, Laine Stump wrote:

On 12/13/2015 10:51 AM, Moshe Levi wrote:

Hi,

 

I have a setup with libvirt 1.3.0 and OpenStack trunk.

Before launching the VM, the ip link command shows the following VF mac/vlan configuration [1].

When I launch a VM with <interface type='hostdev'> via the OpenStack API (OpenStack direct port), I can see that the VF gets the mac/vlan according to the libvirt XML [2] and the ip link command [3], but when I delete the VM the mac/vlan config is still as shown in [3] and is not restored to [1].

Shouldn't libvirt restore the mac/vlan to [1]?

 

The same problem exists when using <interface type='direct'> (OpenStack macvtap port)  but just for the MAC configuration of the VF.


What libvirt does is to restore the MAC address to whatever it was before we set it up for use with a guest. Although there are some sriov net drivers that (for some unfathomable reason) think it's cool to assign a random MAC address to each VF at boot time, the "normal" thing is for the VFs to have a MAC address of all 0's to start with. So libvirt should be saving 00:00:00:00:00:00 (it will be in the file /var/run/libvirt/hostdevmgr/$ifname_vf$vfnum) then setting the MAC to use; when done, libvirt will read the 00:00:00:00:00:00 and use netlink to set the MAC address, but this is apparently failing.
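The save-then-restore sequence just described can be sketched as follows. This is an illustration, not libvirt's code: libvirt uses netlink rather than the ip command, the one-line file format is an assumption, and `save_original_mac` and `restore_command` are hypothetical helpers. Only the `$ifname_vf$vfnum` file naming comes from the description above.

```python
import os
import tempfile


def save_original_mac(state_dir, ifname, vfnum, current_mac):
    """Record the VF's MAC before it is changed for the guest."""
    path = os.path.join(state_dir, f"{ifname}_vf{vfnum}")
    with open(path, "w") as f:
        f.write(current_mac + "\n")
    return path


def restore_command(state_dir, ifname, vfnum):
    """Read the saved MAC back, drop the state file, and build the restore command."""
    path = os.path.join(state_dir, f"{ifname}_vf{vfnum}")
    with open(path) as f:
        saved = f.read().strip()
    os.remove(path)
    return ["ip", "link", "set", "dev", ifname, "vf", str(vfnum), "mac", saved]


# A temporary directory stands in for /var/run/libvirt/hostdevmgr.
with tempfile.TemporaryDirectory() as state_dir:
    save_original_mac(state_dir, "p4p2", 0, "00:00:00:00:00:00")
    print(" ".join(restore_command(state_dir, "p4p2", 0)))
    # ip link set dev p4p2 vf 0 mac 00:00:00:00:00:00
```

When the saved value is all 0's, the restore step ends up issuing exactly the request that fails in the test that follows.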

I checked on my Fedora 22 system with the igb driver, and found that if the MAC address was originally set to something other than 0's, it was restored properly by libvirt, but if it was set to all 0's originally, the attempt to set it back to 0 would fail.

I then tried doing the same thing with the "ip" utility:

    # ip link set dev p4p2 vf 0 mac 00:00:00:00:00:00

and I get the following response:

    RTNETLINK answers: Invalid argument

So it appears that either the kernel or the NIC driver is refusing to set the MAC address to all 0's. I'm reasonably certain this is a regression in the kernel,


Sigh. It appears that this has "always" been the case - I just checked on a 2.6.32-573 RHEL kernel, a 3.10.x RHEL7.2 kernel, and 4.1 (Fedora 22), and all of them also refuse to set the MAC address to 00:00:00:00:00:00. I'm not sure if this limitation is in the NIC driver or some basic code in the kernel.



although I can't say how long it's been there, as I don't normally pay attention to this (and as I said, many SRIOV NIC drivers don't default their VFs to 0 MAC addresses)

What distro and kernel are you using for your tests?
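One plausible explanation for the "Invalid argument" response is an address validity check of the kind the kernel's is_valid_ether_addr() helper performs: an address is accepted only if it is non-zero and not multicast. The sketch below models that rule in Python. Whether this particular check is what rejects the request on a given driver is an assumption that the thread does not establish.

```python
def is_valid_ether_addr(mac):
    """True for a unicast, non-zero address (models the kernel helper's rule)."""
    octets = bytes(int(p, 16) for p in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected 6 bytes: {mac!r}")
    is_zero = not any(octets)
    is_multicast = bool(octets[0] & 0x01)  # low bit of the first byte
    return not is_zero and not is_multicast


for mac in ("00:00:00:00:00:00", "00:00:00:00:00:b1", "fa:16:3e:11:af:fe"):
    print(mac, is_valid_ether_addr(mac))
# 00:00:00:00:00:00 False
# 00:00:00:00:00:b1 True
# fa:16:3e:11:af:fe True
```

Under this rule an all-zero address is rejected, while an address that is zero except for a non-zero final byte counts as valid.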



 

 

 

[1]  - 24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000

    link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff

    vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 2 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 3 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

 

[2] - <interface type='hostdev' managed='yes'>                                                         

  <mac address='fa:16:3e:11:af:fe'/>

  <driver name='kvm'/>                                                                           

  <source>                                                                                       

    <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x7'/>                  

  </source>                                                                                      

  <vlan>                                                                                          

    <tag id='190'/>                                                                              

  </vlan>

  <alias name='hostdev0'/>

  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>

</interface>

 

 

[3] 24: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000

    link/ether e4:1d:2d:a5:f1:22 brd ff:ff:ff:ff:ff:ff

    vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 2 MAC 00:00:00:00:00:00, spoof checking off, link-state auto

    vf 3 MAC fa:16:3e:11:af:fe, vlan 190, spoof checking off, link-state enable

 




--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
