On Tue, 31 Oct 2023 17:33:50 +0200
Juhani Rautiainen <jrauti@xxxxxx> wrote:

> Hi!
>
> I noticed a change in my home server which breaks some of my KVM VMs
> with newer kernels. I have two Intel I350 cards: one with two ports
> and another with four ports. I have been using the two-port card in a
> firewall VM with vfio-pci. The other ports have been given to other
> VMs as host interface devices in KVM. When I upgraded to 6.6 I noticed
> that the four-port card now uses the vfio-pci driver rather than igb
> as it did with 6.4.8, and the VMs using host interfaces didn't start.
> I had earlier built 6.5.5, so I tried that, and it behaves the same
> way as the 6.6 kernel does, so if something has changed it is probably
> in the 6.5 series. I have this in /etc/modprobe.d/vfio.conf:
>
> options vfio_pci ids=8086:1521
>
> With 6.4.8, lspci -vv shows this:
>
> 01:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
>         Subsystem: Intel Corporation Ethernet Server Adapter I350-T4
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 67
>         IOMMU group: 11
>         ....
>         Kernel driver in use: igb
>         Kernel modules: igb
>
> And with 6.5.5 I get:
>
> 01:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
>         Subsystem: Intel Corporation Ethernet Server Adapter I350-T4
>         Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Interrupt: pin A routed to IRQ 255
>         IOMMU group: 11
>         ....
>         Kernel driver in use: vfio-pci
>         Kernel modules: igb
>
> Have I been just lucky previously with my config, or did something
> change? I tried to figure out the change from the 6.5 release notes
> but could not. My home server is running on an AMD Ryzen 5700G and
> Alma Linux 8.8 (I just compile newer kernels out of habit).

The more curious part to me is how your configuration managed to have
some NICs bound to igb and others bound to vfio-pci in the first place.
With the modprobe.d directive, vfio-pci will try to bind to all matching
devices that aren't already bound to a driver. If igb loads first, all
the devices bind to igb; if vfio-pci loads first, all the devices bind
to vfio-pci (do some have a different device ID?). The vfio-pci module
wouldn't get loaded without something else requesting it, so typically
igb would claim everything.

Do you launch your VMs with libvirt? It might have automatically bound
the devices to vfio-pci, and now something may be loading the vfio-pci
module before igb.

The driverctl tool might be useful for you to specify a specific driver
for specific devices. Otherwise I'm not sure what kernel change might
have triggered this behavioral change without knowing more about how and
when the vfio-pci module is loaded relative to the igb module.

Thanks,
Alex
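[A sketch of the checks and the driverctl approach suggested above. The
PCI addresses are placeholders for illustration; substitute the real
ones from `lspci -D` on your host, and note these commands need root.]

```shell
# Confirm whether every I350 port really shares the 8086:1521 device ID
# that the "options vfio_pci ids=8086:1521" directive matches on.
lspci -nn -d 8086:1521

# Show which driver each device is currently bound to.
driverctl list-devices | grep -i ethernet

# Persistently pin the four-port card's functions back to igb and the
# two-port card's functions to vfio-pci (addresses are examples only;
# repeat for each function, e.g. .1, .2, .3 on the four-port card).
driverctl set-override 0000:01:00.0 igb
driverctl set-override 0000:02:00.0 vfio-pci
driverctl set-override 0000:02:00.1 vfio-pci

# Verify the persistent overrides, which survive reboots and module
# load order.
driverctl list-overrides
```

With per-device overrides in place, the modprobe.d ids= option (which
matches every function with that vendor:device ID) could be dropped
entirely, removing the dependence on which module happens to load first.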