Re: [RESEND PATCH v1 1/1] PCI: microchip: Fix potential race in interrupt handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02/05/2022 20:22, Bjorn Helgaas wrote:
> On Sat, Apr 30, 2022 at 12:33:51AM +0100, Marc Zyngier wrote:
>> On Fri, 29 Apr 2022 22:57:33 +0100,
>> Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>>> On Fri, Apr 29, 2022 at 09:42:52AM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
>>>> On 28/04/2022 10:29, Lorenzo Pieralisi wrote:
>>>>> On Tue, Apr 05, 2022 at 12:17:51PM +0100, daire.mcnamara@xxxxxxxxxxxxx wrote:
>>>>>> From: Daire McNamara <daire.mcnamara@xxxxxxxxxxxxx>
>>>>>>
>>>>>> Clear MSI bit in ISTATUS register after reading it before
>>>>>> handling individual MSI bits
> 
>>>> Clear the MSI bit in ISTATUS register after reading it, but before
>>>> reading and handling individual MSI bits from the IMSI register.
>>>> This avoids a potential race where new MSI bits may be set on the
>>>> IMSI register after it was read and be missed when the MSI bit in
>>>> the ISTATUS register is cleared.
> 
>>> Honestly, I don't understand enough about IRQs to determine whether
>>> this is a correct fix.  Hopefully Marc will chime in.  All I really
>>> know how to do is compare all the drivers and see which ones don't fit
>>> the typical patterns.
>>
>> This seems sensible. In general, edge interrupts need an early Ack
>> *before* the handler can be run. If it happens after, you're pretty
>> much guaranteed to lose edges that would be generated between the
>> handler and the late Ack.
>>
>> This can be implemented in HW in a variety of ways (read a register,
>> write a register, or even both).
> 
> Is this something that is or could be documented somewhere under
> Documentation, e.g., "here are the common canonical patterns to use"?
> I feel like an idiot because I have this kind of question all the time
> and I never know how to confidently analyze it.

Daire is still having the IT issues, so before I resend the patch with
a new commit message, how is the following:

Clear the MSI bit in ISTATUS_LOCAL register after reading it, but
before reading and handling individual MSI bits from the ISTATUS_MSI
register. This avoids a potential race where new MSI bits may be set
on the ISTATUS_MSI register after it was read and be missed when the
MSI bit in the ISTATUS_LOCAL register is cleared.

Reported by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
Link: https://lore.kernel.org/linux-pci/20220127202000.GA126335@bhelgaas/
Fixes: 6f15a9c9f941 ("PCI: microchip: Add Microchip PolarFire PCIe controller driver")
Signed-off-by: Daire McNamara <daire.mcnamara@xxxxxxxxxxxxx>
> 
>>> And speaking of that, I looked at all the users of
>>> irq_set_chained_handler_and_data() in drivers/pci.  All the handlers
>>> except mc_handle_intx() and mc_handle_msi() call chained_irq_enter()
>>> and chained_irq_exit().
>>>
>>> Are mc_handle_intx() and mc_handle_msi() just really special, or is
>>> this a mistake?
>>
>> That's just a bug. On the right HW, this would just result in lost
>> interrupts.

Separate issue, separate patch. Do you want them in a series or as
another standalone patch?

Thanks,
Conor.



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux