Hi,
I'm on a Dell VEP 1405 running Debian 9.11, and I'm running a few tests with various interfaces assigned via PCI passthrough to a qemu/KVM virtual machine that also runs Debian 9.11.
I noticed that only one of the four I350 network controllers can be used in PCI passthrough. The available interfaces are:
# dpdk-devbind.py --status
Network devices using kernel driver
===================================
0000:02:00.0 'I350 Gigabit Network Connection 1521' if=eth2 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:02:00.1 'I350 Gigabit Network Connection 1521' if=eth3 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:02:00.2 'I350 Gigabit Network Connection 1521' if=eth0 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:02:00.3 'I350 Gigabit Network Connection 1521' if=eth1 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:04:00.0 'QCA986x/988x 802.11ac Wireless Network Adapter 003c' if= drv=ath10k_pci unused=igb_uio,vfio-pci,uio_pci_generic
0000:05:00.0 'Device 15c4' if=eth7 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic
0000:05:00.1 'Device 15c4' if=eth6 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic
0000:07:00.0 'Device 15e5' if=eth5 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic
0000:07:00.1 'Device 15e5' if=eth4 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic
If I try PCI passthrough on 02:00.2 (eth0), it works fine. With any of the remaining three interfaces, libvirt fails with this error:
# virsh create vnf.xml
error: Failed to create domain from vnf.xml
error: internal error: process exited while connecting to monitor: 2020-04-06T16:08:47.048266Z qemu-system-x86_64: -device vfio-pci,host=02:00.1,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:02:00.1: failed to setup INTx fd: Operation not permitted
The contents of vnf.xml are available here: https://pastebin.com/rT3RmAi5
This is what happened in dmesg when I tried to start the VM:
[ 7305.371730] igb 0000:02:00.1: removed PHC on eth3
[ 7307.085618] ACPI Warning: \_SB.PCI0.PEX2._PRT: Return Package has no elements (empty) (20160831/nsprepkg-130)
[ 7307.085717] pcieport 0000:00:0b.0: can't derive routing for PCI INT B
[ 7307.085719] vfio-pci 0000:02:00.1: PCI INT B: no GSI
[ 7307.369611] igb 0000:02:00.1: enabling device (0400 -> 0402)
[ 7307.369668] ACPI Warning: \_SB.PCI0.PEX2._PRT: Return Package has no elements (empty) (20160831/nsprepkg-130)
[ 7307.369764] pcieport 0000:00:0b.0: can't derive routing for PCI INT B
[ 7307.369766] igb 0000:02:00.1: PCI INT B: no GSI
[ 7307.426266] igb 0000:02:00.1: added PHC on eth3
[ 7307.426269] igb 0000:02:00.1: Intel(R) Gigabit Ethernet Network Connection
[ 7307.426271] igb 0000:02:00.1: eth3: (PCIe:5.0Gb/s:Width x2) 50:9a:4c:ee:9f:b1
[ 7307.426350] igb 0000:02:00.1: eth3: PBA No: 106300-000
[ 7307.426352] igb 0000:02:00.1: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
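For anyone comparing several devices, the interrupt-routing complaints in the log above can be reduced to a list of affected PCI addresses. This is only a minimal sketch: the function name no_gsi_devices is made up for illustration, and it assumes the kernel wording matches the "no GSI" / "not connected" lines shown in this thread.

```shell
# Print the unique PCI addresses for which the kernel log reports that a
# legacy INTx line has no routing. Reads kernel log text on stdin.
# no_gsi_devices is a hypothetical helper name, not an existing tool.
no_gsi_devices() {
    # Keep lines such as "vfio-pci 0000:02:00.1: PCI INT B: no GSI"
    # or "igb 0000:02:00.1: PCI INT B: not connected", then extract
    # the domain:bus:device.function address from each of them.
    grep -E 'PCI INT [A-D]: (no GSI|not connected)' \
        | grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' \
        | sort -u
}
```

Typical use would be: dmesg | no_gsi_devices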
These are all the messages related to that device in dmesg before I tried to start the VM:
# dmesg | grep 02:00.1
[ 0.185301] pci 0000:02:00.1: [8086:1521] type 00 class 0x020000
[ 0.185317] pci 0000:02:00.1: reg 0x10: [mem 0xdfd40000-0xdfd5ffff]
[ 0.185334] pci 0000:02:00.1: reg 0x18: [io 0xd040-0xd05f]
[ 0.185343] pci 0000:02:00.1: reg 0x1c: [mem 0xdfd88000-0xdfd8bfff]
[ 0.185434] pci 0000:02:00.1: PME# supported from D0 D3hot D3cold
[ 0.185464] pci 0000:02:00.1: reg 0x184: [mem 0xdeea0000-0xdeea3fff 64bit pref]
[ 0.185467] pci 0000:02:00.1: VF(n) BAR0 space: [mem 0xdeea0000-0xdeebffff 64bit pref] (contains BAR0 for 8 VFs)
[ 0.185486] pci 0000:02:00.1: reg 0x190: [mem 0xdee80000-0xdee83fff 64bit pref]
[ 0.185488] pci 0000:02:00.1: VF(n) BAR3 space: [mem 0xdee80000-0xdee9ffff 64bit pref] (contains BAR3 for 8 VFs)
[ 0.334021] DMAR: Hardware identity mapping for device 0000:02:00.1
[ 0.334463] iommu: Adding device 0000:02:00.1 to group 16
[ 0.398809] pci 0000:02:00.1: Signaling PME through PCIe PME interrupt
[ 2.588049] igb 0000:02:00.1: PCI INT B: not connected
[ 2.643900] igb 0000:02:00.1: added PHC on eth1
[ 2.643903] igb 0000:02:00.1: Intel(R) Gigabit Ethernet Network Connection
[ 2.643905] igb 0000:02:00.1: eth1: (PCIe:5.0Gb/s:Width x2) 50:9a:4c:ee:9f:b1
[ 2.643984] igb 0000:02:00.1: eth1: PBA No: 106300-000
[ 2.643986] igb 0000:02:00.1: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[ 2.873544] igb 0000:02:00.1 rename3: renamed from eth1
[ 2.939352] igb 0000:02:00.1 eth3: renamed from rename3
In particular this looks suspicious: igb 0000:02:00.1: PCI INT B: not connected
The full dmesg is available here: https://pastebin.com/kPbUAKCi
This is the PCI bus structure:
# lspci -tv
-[0000:00]-+-00.0 Intel Corporation Device 1980
           +-04.0 Intel Corporation Device 19a1
           +-05.0 Intel Corporation Device 19a2
           +-06.0-[01]----00.0 Intel Corporation Device 19e2
           +-0b.0-[02-03]--+-00.0 Intel Corporation I350 Gigabit Network Connection
           |               +-00.1 Intel Corporation I350 Gigabit Network Connection
           |               +-00.2 Intel Corporation I350 Gigabit Network Connection
           |               \-00.3 Intel Corporation I350 Gigabit Network Connection
           +-0f.0-[04]----00.0 Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter
           +-12.0 Intel Corporation DNV SMBus Contoller - Host
           +-13.0 Intel Corporation DNV SATA Controller 0
           +-15.0 Intel Corporation Device 19d0
           +-16.0-[05-06]--+-00.0 Intel Corporation Device 15c4
           |               \-00.1 Intel Corporation Device 15c4
           +-17.0-[07-08]--+-00.0 Intel Corporation Device 15e5
           |               \-00.1 Intel Corporation Device 15e5
           +-18.0 Intel Corporation Device 19d3
           +-1c.0 Intel Corporation Device 19db
           +-1f.0 Intel Corporation DNV LPC or eSPI
           +-1f.2 Intel Corporation Device 19de
           +-1f.4 Intel Corporation DNV SMBus controller
           \-1f.5 Intel Corporation DNV SPI Controller
Looking at lspci -v, something is off with the IRQ field in exactly the three devices I can't use in PCI passthrough ("IRQ -2147483648"):
# lspci -v|grep -A1 I350
02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
Flags: bus master, fast devsel, latency 0, IRQ -2147483648
--
02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
Flags: bus master, fast devsel, latency 0, IRQ -2147483648
--
02:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
Flags: bus master, fast devsel, latency 0, IRQ 18
--
02:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
Flags: bus master, fast devsel, latency 0, IRQ -2147483648
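The same check can be run across every device on the system rather than just the I350 ports. The following is a rough sketch, assuming the lspci -v layout shown above (a slot header line followed by a "Flags:" line); negative_irq_slots is a hypothetical name chosen here for illustration.

```shell
# Print the slot of every device whose "Flags:" line reports a negative IRQ.
# Reads `lspci -v` output on stdin. negative_irq_slots is a hypothetical
# helper name, not part of pciutils.
negative_irq_slots() {
    awk '
        # A device header starts at column 0 with a slot such as "02:00.1"
        # (optionally prefixed by a PCI domain, e.g. "0000:02:00.1").
        /^([0-9a-f]+:)?[0-9a-f]+:[0-9a-f]+\.[0-7] / { slot = $1 }
        # Any later line mentioning "IRQ -<digits>" belongs to that slot.
        /IRQ -[0-9]+/ && slot != "" { print slot }
    '
}
```

Typical use would be: lspci -v | negative_irq_slots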
Finally, each I350 interface is in its own IOMMU group under /sys/kernel/iommu_groups/.
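One way to confirm the grouping is to walk the sysfs tree and print each device next to its group number. This is a sketch only: list_iommu_groups is a made-up name, and the optional directory argument is an assumption added so the function can be pointed at a copy of the tree.

```shell
# Print "<group> <device>" for every device under an iommu_groups directory.
# The root defaults to /sys/kernel/iommu_groups; the optional argument is an
# illustrative extra so the function can be run against a test tree.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing: skip
        group=${dev%/devices/*}        # strip the "/devices/<address>" suffix
        printf '%s %s\n' "${group##*/}" "${dev##*/}"
    done
}
```

A device that shares its group with other devices would show up as several lines carrying the same group number.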
The kernel I'm using in the host machine is 4.9.189 and my libvirt version is 4.3.0.
Any thoughts on this?
Is there something I should enable in the BIOS or in the kernel to make this work?
Thanks!
Regards,
Riccardo Ravaioli
So ultimately the problem was somewhere in the BIOS. A BIOS update fixed the issue.
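To compare firmware levels before and after such an update, the BIOS identification strings can be read from sysfs. This is a minimal sketch, assuming the kernel exposes the usual DMI files under /sys/class/dmi/id; bios_info is a hypothetical helper name and the directory argument is only there for illustration.

```shell
# Print the BIOS vendor, version and date as exposed by the kernel's DMI
# sysfs interface. bios_info is a hypothetical helper name; the optional
# argument overrides the directory that is read.
bios_info() {
    dmi=${1:-/sys/class/dmi/id}
    for f in bios_vendor bios_version bios_date; do
        # Silently skip any file that is missing or unreadable.
        [ -r "$dmi/$f" ] && printf '%s: %s\n' "$f" "$(cat "$dmi/$f")"
    done
    return 0
}
```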
Riccardo
On Tue, 7 Apr 2020 at 18:05, Riccardo Ravaioli <riccardoravaioli@xxxxxxxxx> wrote: