On Tue, 9 Jan 2018 22:36:01 +0100
Binarus <lists@xxxxxxxxxx> wrote:

> To answer my own message:
>
> On 09.01.2018 18:58, Binarus wrote:
>
> > The Seabios boot screen hangs for about a minute or so. Then the OS
> > (W2K8 R2 server 64 bit) hangs forever at the first screen which shows
> > the progress bar. By booting into safe mode, I have found out that this
> > happens when it tries to load the classpnp.sys driver.
> >
> > In some cases, when starting the VM, there was a message on the console
> > saying it was disabling IRQ 16.
> >
> > This is the point where I am lost (again).
>
> It seems I have got it to work. I have added the option
> "x-no-kvm-intx=on" to the device definition. My command line is now:
>
> /usr/bin/qemu-system-x86_64
> -machine q35,accel=kvm
> -cpu host
> -smp cores=2,threads=2,sockets=1
> -rtc base=localtime,clock=host,driftfix=none
> -drive file=/vm-image/dax.img,format=raw,if=virtio,cache=writeback,index=0
> -device
> ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=2,chassis=1,id=root.1
> -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,x-no-kvm-intx=on
> -boot c
> -pidfile /root/qemu-kvm/qemu-dax.pid
> -m 12288
> -k de
> -daemonize
> -usb -usbdevice "tablet"
> -name dax
> -device virtio-net-pci,vlan=0,mac=02:01:01:01:02:01
> -net
> tap,vlan=0,name=dax,ifname=dax0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
> -vnc :2
>
> This command line makes the Seabios hang for between 30 and 60 seconds
> (it seems the time it takes is not always the same) during the boot
> process, but then boots up the W2K8 R2 server without any issue. Within
> the VM, I have installed the Marvell Windows drivers for the
> controller's chipset. Great!
>
> And as desired, I can now cleanly "eject" the disks connected to that
> controller without leaving the VM, i.e. without visiting the host's console.
>
> Remaining questions:
>
> - What could make the Seabios hang for such a long time upon every boot?

Perhaps some sort of problem with the device ROM.
Assuming you're not booting the VM from the assigned device, you can
add rombar=0 to the qemu vfio-pci device options to disable the ROM.
I suppose it's possible that SeaBIOS might know how to talk to the
device regardless of the ROM, so no guarantees that will resolve it.
Setting a bootindex both on the vfio-pci device and the actual boot
device could help. I think the '-boot c' option is deprecated;
explicitly specifying an emulated controller would be better.
virt-install or virt-manager would do this for you. Also, using q35
vs 440fx for the VM machine type makes no difference; q35 is, if
anything, more troublesome imo.

> - Could you please shortly explain what the option "x-no-kvm-intx=on"
> does and why I need it in this case?

INTx is the legacy PCI interrupt (i.e. INTA, INTB, INTC, INTD). This is
a level-triggered interrupt, so it continues to assert until the
device is serviced. It must therefore be masked on the host while it
is handled by the guest. There are two paths we can use for injecting
this interrupt into the VM and unmasking it on the host once the VM
samples the interrupt. When KVM is used for acceleration, these happen
via direct connection between the vfio-pci and kvm modules using
eventfds and irqfds. The x-no-kvm-intx option disables that path,
instead bouncing out to QEMU to do the same. TBH, I have no idea why
this would make it work. The QEMU path is slower than the KVM path,
but they should be functionally identical.

> - Could you please shortly explain what exactly it wants to tell me when
> it says that it disables INT xx, and notable if this is a bad thing I
> should take care of?

The "Disabling IRQ XX, nobody cared" message means that the specified
IRQ asserted many times without any of the interrupt handlers claiming
that it was their device asserting it. The kernel then masks the
interrupt at the APIC. With device assignment this can mean that the
mechanism we use to mask the device doesn't work for that device.
There's a vfio-pci module option you can use to have vfio-pci mask the
interrupt at the APIC rather than the device, nointxmask=1. The
trouble with this option is that it can only be used with exclusive
interrupts, so if any other devices share the interrupt, starting the
VM will fail. As a test, you can unbind conflicting devices from their
drivers (assuming non-critical devices).

The troublesome point here is that regardless of x-no-kvm-intx, the
kernel uses the same masking technique for the device, so it's unclear
why one works and the other does not.

> - What about the "x-no-kvm-msi" and "x-no-kvm-msix" options? Would it be
> better to use them as well? I couldn't find any sound information about
> what exactly they do (Note: Initially, I had all three of those
> "x-no..." options active, which made the VM boot the first time, and
> later out of curiosity found out that "x-no-kvm-intx" is the essential
> one. Without this one, the VM won't boot; the other two don't seem to
> change anything in my case).

Similar to the INTx version, they route the interrupts out through QEMU
rather than inject them through a side channel with KVM. They're just
slower. Generally these options are only used for debugging, as they
make the interrupts visible to QEMU; functionality is generally not
affected.

What interrupt mode does the device operate in once the VM is running?
You can run 'lspci -vs <device address>' on the host and see something
like:

    Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
    Capabilities: [70] MSI-X: Enable+ Count=10 Masked-

In this case the Enable+ shows the device is using MSI-X rather than
MSI, which shows Enable-. The device might not support both (or
either). If none are Enable+, legacy interrupts are probably being
used. Often legacy interrupts are only used at boot and then the
device switches to MSI/X. If that's the case for this device,
x-no-kvm-intx doesn't really hurt you at runtime.
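
If you want to repeat that lspci check without eyeballing the output,
something along these lines might do. This is only a rough sketch,
assuming Python 3 on the host and capability lines formatted like the
two shown above; the names interrupt_mode and SAMPLE are made up for
illustration and are not part of any tool.

```python
import re

# Matches lspci capability lines such as
#   "Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+"
#   "Capabilities: [70] MSI-X: Enable+ Count=10 Masked-"
# Assumption: only the Enable flag right after the capability name matters.
CAP_RE = re.compile(r"Capabilities: \[[0-9a-fA-F]+\] (MSI-X|MSI): Enable([+-])")

# Example text copied from the lines quoted above (hypothetical device).
SAMPLE = """\
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
"""


def interrupt_mode(lspci_text):
    """Guess the interrupt mode from one device's 'lspci -v' output."""
    enabled = {}
    for line in lspci_text.splitlines():
        match = CAP_RE.search(line)
        if match:
            # Record whether this capability shows Enable+ or Enable-.
            enabled[match.group(1)] = (match.group(2) == "+")
    for cap in ("MSI-X", "MSI"):
        if enabled.get(cap):
            return cap
    # Neither capability is enabled (or present): INTx is the likely
    # fallback, but this is an inference, not something lspci states.
    return "INTx (probably)"


print(interrupt_mode(SAMPLE))
```

Feed it the text printed by 'lspci -vs <device address>' on the host
while the VM is running. Running lspci as root is the safer choice,
since capabilities may be hidden from unprivileged users.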
> - Could we expect your patch to go into upstream (perhaps after the
> above issues / questions have been investigated)? I will try to convince
> the Debian people to include the patch into 4.9; if they refuse, I will
> have to compile a new kernel each time they release one, which happens
> quite often (probably security fixes) since some time ...

I would not recommend trying to convince Debian to take a non-upstream
patch. The process is that I need to do more research to figure out
why this device isn't already quirked. I'm sure others have
complained, but did similar patches make things worse for them, or did
they simply disappear?

Can you confirm whether the device behaves properly for host use with
the patch? Issues with assigning the device could be considered
secondary if the host behavior is obviously improved. Alternatively,
the 9230, or various others in that section of the quirk code, are
already quirked, so you can decide if picking a different $30 card is
a better option for you ;) Thanks,

Alex