Re: Unable to pass SATA controller to VM with intel_iommu=igfx_off

Binarus <lists@xxxxxxxxxx> · Wed, 10 Jan 2018 17:40:13 +0100

Alex, thank you! I think I have solved the performance problem and have
made some interesting observations.

On 09.01.2018 23:41, Alex Williamson wrote:
>> - Could you please briefly explain what exactly it wants to tell me when
>> it says that it disables INT xx, and notably whether this is a bad
>> thing I should take care of?
> 
> The "Disabling IRQ XX, nobody cared" message means that the specified
> IRQ asserted many times without any of the interrupt handlers claiming
> that it was their device asserting it.  It then masks the interrupt at
> the APIC.  With device assignment this can mean that the mechanism we
> use to mask the device doesn't work for that device.  There's a
> vfio-pci module option you can use to have vfio-pci mask the interrupt
> at the APIC rather than the device, nointxmask=1.  The trouble with
> this option is that it can only be used with exclusive interrupts, so
> if any other devices share the interrupt, starting the VM will fail.
> As a test, you can unbind conflicting devices from their drivers
> (assuming non-critical devices).

This statement has put me on the right track:

First, I rebooted the machine without vfio_pci and looked into
/proc/interrupts. The SATA controller in question was bound to INT 37
and was the *only* device using that INT.
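
For anyone who wants to repeat this check, here is a minimal Python
sketch that lists what is attached to a given interrupt line. It is not
what I literally ran (I just read the file by eye). The function name
and the sample text are invented for illustration, and the parsing
assumes the layout used by 4.x kernels, where the per-CPU counters are
followed by two descriptor fields (chip name and trigger type) before
the handler names.

```python
def devices_on_irq(interrupts_text, irq):
    """Return the handler names listed for one IRQ in /proc/interrupts text."""
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        # Drop the per-CPU counters, then the two descriptor fields
        # (assumed layout: chip name, then trigger type).
        tail = rest.split()[ncpus:]
        actions = " ".join(tail[2:])
        return [name.strip() for name in actions.split(",") if name.strip()]
    return []


# Invented sample, NOT copied from the machine discussed in this thread.
SAMPLE = """\
           CPU0       CPU1
 16:        120          0   IO-APIC  16-fasteoi   uhci_hcd:usb3
 37:      51234          0   PCI-MSI 512000-edge      ahci[0000:03:00.0]
NMI:          0          0   Non-maskable interrupts
"""

if __name__ == "__main__":
    print(devices_on_irq(SAMPLE, 37))
```

On a real host, pass open("/proc/interrupts").read() instead of SAMPLE.
A result with exactly one entry means the interrupt is not shared.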

I then rebooted with vfio_pci active and tried to start the VM, passing
through the SATA controller to it. As described in my previous messages,
the console showed an error message saying that it disabled INT 16 (!)
when starting the VM.

I looked into /proc/interrupts again and noticed that INT 16 was bound
to one of the USB ports, and that this was the only device using INT 16.

Then I added nointxmask=1 to vfio_pci's options, ran depmod, updated the
initramfs, and kept this setting for all further experiments.
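
A module option like this is normally set with a line such as
"options vfio-pci nointxmask=1" in a file under /etc/modprobe.d/ (the
file name is up to you). To confirm after the reboot that the option
really took effect, the value can be read back from sysfs. The
following is a small Python sketch of that check. The function name is
made up, and the sysfs_root argument exists only so that the function
can be tried against a fake directory tree.

```python
from pathlib import Path


def nointxmask_enabled(sysfs_root="/sys"):
    """Return True/False for vfio_pci's nointxmask, or None if unavailable."""
    param = (Path(sysfs_root) / "module" / "vfio_pci"
             / "parameters" / "nointxmask")
    try:
        value = param.read_text().strip()
    except FileNotFoundError:
        # vfio_pci is not loaded, or the parameter is not exposed.
        return None
    # Boolean module parameters read back as "Y" or "N".
    return value in ("Y", "1")


if __name__ == "__main__":
    print(nointxmask_enabled())
```

A return value of None means the check could not be made at all, which
is different from the option being off.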

After having rebooted, I removed all "x-no-" options (the ones we talked
about recently) from the device definitions of the VM. Then I unbound
the USB port in question (i.e. the one which used INT 16) from its
driver. Although lspci was still claiming that this USB port was using
INT 16, /proc/interrupts showed that INT 16 was not bound to a driver
any more.
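
For reference, unbinding works by writing the device's PCI address into
the "unbind" file of its driver in sysfs. Here is a hedged Python
sketch of that step. The function name and the address in the example
are placeholders (not the real address of my USB controller), the
sysfs_root argument is only there for dry runs against a fake tree, and
on a real system this needs root.

```python
from pathlib import Path


def unbind_pci_device(address, sysfs_root="/sys"):
    """Detach a PCI device from its host driver; False if none is bound."""
    driver = Path(sysfs_root) / "bus" / "pci" / "devices" / address / "driver"
    if not driver.exists():
        return False  # no driver currently bound to this device
    # Writing the address to <driver>/unbind asks the kernel to release it.
    (driver / "unbind").write_text(address)
    return True


if __name__ == "__main__":
    # Placeholder address; replace it with the one lspci -D shows.
    print(unbind_pci_device("0000:00:1a.0"))
```

The unbinding does not survive a reboot, so it has to be repeated
before each start of the VM (or scripted).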

Then I started the VM. The console did not show any messages any more,
the VM booted without any issue, *and SATA speed was back to normal
again* (100 MB/s with nointxmask=1 and that USB port unbound versus 2
MB/s without nointxmask and without unbinding that USB port).

I have lost one USB port, but finally have full SATA hardware in the VM.
I can very well live with the lost USB port because there are plenty of
them, and it was USB 1.1 anyway. I will stick with this configuration
for the time being.

*And here is the interesting (from my naive point of view) part which
might explain what happened:*

/proc/interrupts (with the VM running!) shows that *vfio-intx is using
INT 16* now. KVM / QEMU obviously had the idea to assign INT 16 to the
vfio device *although* INT 16 was already bound to a USB port which was
active in the host, and *although* the device which is passed through
would be at INT 37 if vfio_pci were not active.

Therefore, the console was showing the error message regarding INT 16;
obviously, the kernel / KVM / QEMU could not handle the interrupt
sharing between the host USB port and the vfio_pci device which KVM /
QEMU had made necessary.

By the way, this is the only vfio_pci device on this machine.

Should we consider this behavior a bug? Why does a vfio_pci device get
bound to an interrupt which is bound to another hardware device on the
host? Do we have any chance to influence that (modinfo vfio_pci does not
show any parameter related to interrupt numbers)?

>> - Could we expect your patch to go into upstream (perhaps after the
>> above issues / questions have been investigated)? I will try to convince
>> the Debian people to include the patch into 4.9; if they refuse, I will
>> have to compile a new kernel each time they release one, which happens
>> quite often (probably security fixes) for some time now ...
> 
> I would not recommend trying to convince Debian to take a non-upstream
> patch, the process is that I need to do more research to figure out
> why this device isn't already quirked, I'm sure others have complained,
> but did similar patches make things worse for them or did they simply
> disappear.  Can you confirm whether the device behaves properly for
> host use with the patch?  Issues with assigning the device could be
> considered secondary if the host behavior is obviously improved.
> Alternatively, the 9230, or various others in that section of the
> quirk code, are already quirked, so you can decide if picking a
> different $30 card is a better option for you ;) Thanks,

I am not sure if the interrupt conflict between the USB port and
vfio_pci is related to that chipset in particular. I guess (it's really
that: a guess) that KVM or QEMU do not assign an appropriate interrupt
number to vfio_pci devices under certain circumstances. If this is the
case, it could happen with other controllers / chipsets of all kinds as
well.

Thus, I assume we have that controller running now. If you are
interested, I will test for a while and report back if it is stable; I
would like to keep it passed through into the VM, though, so I can't
test if it is stable for the host. However, if the latter is a high
priority thing for you, I'll revert the configuration and let it run in
the host for a week or so.

Regards and many thanks,

Binarus