On 2/1/22 09:28, Dmitrii Shcherbakov wrote: > SmartNIC DPUs may not expose some privileged eswitch operations > to the hypervisor hosts. For example, this happens with Bluefield > devices running in the ECPF (default) mode [1] for security reasons. While > VF MAC address programming is possible via an RTM_SETLINK operation, > trying to set a VLAN ID in the same operation may fail with EPERM. > > Specifically for the mlx5 driver this behavior was altered in the Linux > kernel upstream [2] to avoid getting EPERM when trying to program VLAN 0. > This may also get backported to older downstream kernels (e.g. [3]). > However, Libvirt could potentially handle this case gracefully without > needing a specific kernel version or depending on a specific driver fix. > > In the kernel a relevant call chain may look like > > do_setlink -> do_setvfinfo -> dev->netdev_ops->set_vf_vlan > > which calls a driver-specific function like [4] eventually. > > The equivalent ip link commands below provide an illustration: > > 1. This will work without an error: > > sudo ip link set enp130s0f0 vf 2 mac de:ad:be:ef:ca:fe > > 2. Setting (or clearing) a VLAN will fail with EPERM: > > sudo ip link set enp130s0f0 vf 2 vlan 0 > RTNETLINK answers: Operation not permitted > > 3. This is what Libvirt attempts to do today when trying to clear a > VF VLAN at the same time as programming a VF MAC: > > sudo ip link set enp130s0f0 vf 2 vlan 0 mac de:ad:be:ef:ca:fe > RTNETLINK answers: Operation not permitted > > If setting an explicit VLAN ID results in an EPERM, clearing a VLAN > (setting a VLAN ID to 0) can be handled gracefully by ignoring the > EPERM error with the rationale being that if we cannot set this state > in the first place, we cannot clear it either. > > Thus, virNetDevSetVfConfig is split into two distinct functions. If > clearing a VLAN ID fails with EPERM when clearing is implicit, the > error is simply ignored. For explicit clearing EPERM is still a > fatal error. > > Both new functions rely virNetDevSendVfSetLinkRequest that implements > common functionality related to formatting a request, sending it and > handling error conditions and returns 0 or an error since in both cases > the payload is either NLMSG_DONE (no error) or NLMSG_ERROR where an > error message is needed by the caller to handle known cases > appropriately. This function allows the conditional code to be unit tested. > > An alternative to this could be providing a higher level control plane > mechanism that would provide metadata about a device being remotely > managed in which case Libvirt would avoid trying to set or clear a > VLAN ID. This would be more complicated since other software (like Nova > in the OpenStack case) would have to annotate every guest device with an > attribute indicating whether a device is remotely managed or not based > on operator provided configuration so that Libvirt can act on this and > avoid VLAN programming. > > https://gitlab.com/dmitriis/libvirt/-/pipelines/460293963 > > v8 change: > > * Rebased on top of the latest changes to Libvirt; > * Added relevant upstream Linux and downstream kernel references to > this cover letter. > > [1] https://docs.mellanox.com/display/BlueFieldSWv35111601/Modes+of+Operation#ModesofOperation-SmartNICmode > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7846665d3504812acaebf920d1141851379a7f37 > [3] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1957753 > [4] https://github.com/torvalds/linux/blob/v5.15/drivers/net/ethernet/mellanox/mlx5/core/esw/legacy.c#L427-L434 > > Dmitrii Shcherbakov (3): > Set VF MAC and VLAN ID in two different operations > Allow VF vlanid to be passed as a pointer > Ignore EPERM on implicit clearing of VF VLAN ID > > NEWS.rst | 14 ++ > src/hypervisor/virhostdev.c | 4 +- > src/libvirt_private.syms | 7 + > src/util/virnetdev.c | 256 +++++++++++++++++++++++++----------- > src/util/virnetdevpriv.h | 44 +++++++ > tests/virnetdevtest.c | 249 ++++++++++++++++++++++++++++++++++- > 6 files changed, 496 insertions(+), 78 deletions(-) > create mode 100644 src/util/virnetdevpriv.h > Sorry for allowing this go to v8 without proper review. To make it up to you, here's my suggestion: problems I've raised in my review are easy to solve. I've uploaded the 'fixup' commits onto my gilab here: https://gitlab.com/MichalPrivoznik/libvirt/-/commits/vf_vlan_id/ and if you agree, I'd squash them in respective commits and merge. Laine, any thoughts? Michal