On 2024/09/05 6:10, Alex Williamson wrote:
> On Wed, 4 Sep 2024 23:24:53 +0300
> Andy Shevchenko <andy.shevchenko@xxxxxxxxx> wrote:
>
>> Wed, Sep 04, 2024 at 12:07:21PM -0600, Alex Williamson kirjoitti:
>>> On Wed, 04 Sep 2024 15:37:25 +0200
>>> Philipp Stanner <pstanner@xxxxxxxxxx> wrote:
>>>> On Wed, 2024-09-04 at 17:25 +0900, Damien Le Moal wrote:
>>
>> ...
>>
>>>> If vfio-pci can get rid of pci_intx() altogether, that might be a good
>>>> thing. As far as I understood Andy Shevchenko, pci_intx() is outdated.
>>>> There's only a handful of users anyway.
>>>
>>> What's the alternative?
>>
>> From an API perspective, pci_alloc_irq_vectors() & Co should be used.
>
> We can't replace a device level INTx control with a vector allocation
> function.
>
>>> vfio-pci has a potentially unique requirement here: we don't know how
>>> to handle the device interrupt, we only forward it to the userspace
>>> driver. As a level triggered interrupt, INTx will continue to assert
>>> until that userspace driver handles the device. That's obviously
>>> unacceptable from a host perspective, so INTx is masked at the device
>>> via pci_intx() where available, or at the interrupt controller
>>> otherwise. The API with the userspace driver requires that driver to
>>> unmask the interrupt, again resulting in a call to pci_intx() or
>>> unmasking the interrupt controller, in order to receive further
>>> interrupts from the device. Thanks,
>>
>> I briefly read the discussion and if I understand it correctly the
>> problem here is in the flow: when the above mentioned API is being
>> called. Hence it's a design (or architectural) level issue and changing
>> a call from foo() to bar() won't magically make the problem go away.
>> But I might be mistaken.
>
> Certainly from a vector allocation standpoint we can change to whatever
> is preferred, but the direct INTx manipulation functions are a
> different thing entirely and afaik there's nothing else that can
> replace them at a low level, nor can we just get rid of our calls to
> pci_intx(). Thanks,

But can these calls be moved out of the spinlock context? If not, then we
need to clarify that pci_intx() can be called from any context, which will
require changing to GFP_ATOMIC for the resource allocation, even if the
use case cannot trigger the allocation. This is needed to ensure the
correctness of pci_intx() use. Frankly, I am surprised that the might-sleep
splat you got was not already reported before (fuzzing or static analyzers
might eventually catch it, though).

The other solution would be a version of pci_intx() that takes a gfp flags
argument, allowing callers to use the right gfp flags for their call
context.

>
> Alex
>

-- 
Damien Le Moal
Western Digital Research