Re: [RFC PATCH] PCI: Introduce INTx check & mask API

Alexey Kardashevskiy <aik@xxxxxxxxx> · Fri, 25 May 2012 12:47:35 +1000

On 25/05/12 12:29, Jan Kiszka wrote:
> On 2012-05-24 22:18, Alexey Kardashevskiy wrote:
>> On 24/05/12 22:02, Jan Kiszka wrote:
>>> On 2012-05-24 04:44, Alexey Kardashevskiy wrote:
>>>> [Found while debugging VFIO on POWER but it is platform independent]
>>>>
>>>> There is a feature in PCI (>=2.3?) to mask/unmask INTx via PCI_COMMAND and
>>>> PCI_STATUS registers.
>>>
>>> Yes, 2.3 introduced this. Masking is done via command register, checking
>>> if the source was the PCI in question via the status register. The
>>> latter is important for supporting IRQ sharing - and that's why we
>>> introduced this masking API to the PCI layer.
>>
>>
>> Is not it just a quite small optimization to not to disable interrupts on all devices which share
>> the same IRQ but just on those who fired an interrupt? If so, do PCI devices really often share
>> IRQs? Does not supporting this mean real slowdown on such devices?
>>
>> As far as I understand, everyone who cares about performance uses MSI/MSIX, no?
> 
> Not everyone is blessed with MSI-only PCI devices. From my notebook:
> 
> # cat /proc/interrupts
> [...]
>  22: [...] IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2
> 
> So, if I want to assign one EHCI controller to a guest, I have to
> disable the other as well. The same can happen quickly if you attach a
> few legacy PCI adapters to a system and want to pass them through.

Why? vfio-pci receives interrupt, disables it, handles it, enables interrupt back. Yes, handling is
a bit longer and includes passing interrupt to QEMU and then to the guest (can be optimized to avoid
QEMU) and waiting for EOI notification but this is all the difference.

Does the current kernel use INTx bit for your USB controllers now, without any KVM, etc?

So, is it just an optimization or it is something bigger that I missed?

>>>> And there is some API to support that (commit a2e27787f893621c5a6b865acf6b7766f8671328).
>>>>
>>>> I have a network adapter:
>>>> 0001:00:01.0 Ethernet controller: Chelsio Communications Inc T310 10GbE Single Port Adapter
>>>> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
>>>> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>
>>>> pci_intx_mask_supported() reports that the feature is supported for this adapter
>>>> BUT the adapter does not set PCI_STATUS_INTERRUPT so pci_check_and_set_intx_mask()
>>>> never changes PCI_COMMAND and INTx does not work on it when we use it as VFIO-PCI device.
>>>>
>>>> If I remove the check of this bit, it works fine as it is called from an interrupt handler and
>>>> Status bit check is redundant.
>>>>
>>>> Opened a spec:
>>>> PCI LOCAL BUS SPECIFICATION, REV. 3.0, Table 6-2: Status Register Bits
>>>> ===
>>>> 3	This read-only bit reflects the state of the interrupt in the
>>>> device/function. Only when the Interrupt Disable bit in the command
>>>> register is a 0 and this Interrupt Status bit is a 1, will the
>>>> device’s/function’s INTx# signal be asserted. Setting the Interrupt
>>>>    Disable bit to a 1 has no effect on the state of this bit.
>>>> ===
>>>> With this adapter, INTx# is asserted but Status bit is still 0.
>>>>
>>>> Is it mandatory for a device to set Status bit if it supports INTx masking?
>>>>
>>>> 2 Alex: if it is mandatory, then we need to be able to disable pci_2_3 in VFIO-PCI
>>>> somehow.
>>>
>>> Since PCI 2.3, this bit is mandatory, and it should be independent of
>>> the masking bit. The question is, if your device is supposed to support
>>> 2.3, thus is just buggy, or if our detection algorithm is unreliable. It
>>> basically builds on the assumption that, if we can flip the mask bit,
>>> the feature should be present. I guess that is the best we can do. Maybe
>>> we can augment this with a blacklist of devices that "support" flipping
>>> without actually providing the feature.
>>
>> It is a good moment to start :)
>> Not sure where - in VFIO or along with that PCI INTx API.
> 
> At PCI level as the API is VFIO agnostic (it was introduced for
> "classic" KVM device assignment, in fact).
>> Here is that broken device:
>> aik@vpl2:~$ lspci -s 1:1:0.0
>> 0001:01:00.0 Ethernet controller: Chelsio Communications Inc T310 10GbE Single Port Adapter
>> aik@vpl2:~$ lspci -ns 1:1:0.0
>> 0001:01:00.0 0200: 1425:0030
> 
> A patch to add the infrastructure as well would be even more welcome. :)
> You could have a look at drivers/pci/quirks.c for patterns how to do this.

I am not sure yet that we need this feature at all ;) I would rather prefer to have some way to
disable it in VFIO rather than to add yet another quirk for the feature which nobody uses at the moment.
Really, this device supports MSI/MSIX and in real life nobody is going to use INTx on it. The only
need for it is testing.

-- 
Alexey
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html