Re: [RESEND PATCH v1 1/1] PCI: microchip: Fix potential race in interrupt handling

On Wed, May 04, 2022 at 04:12:39PM +0100, Conor Dooley wrote:
> On 02/05/2022 20:22, Bjorn Helgaas wrote:
> > On Sat, Apr 30, 2022 at 12:33:51AM +0100, Marc Zyngier wrote:
> >> On Fri, 29 Apr 2022 22:57:33 +0100,
> >> Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >>> On Fri, Apr 29, 2022 at 09:42:52AM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
> >>>> On 28/04/2022 10:29, Lorenzo Pieralisi wrote:
> >>>>> On Tue, Apr 05, 2022 at 12:17:51PM +0100, daire.mcnamara@xxxxxxxxxxxxx wrote:
> >>>>>> From: Daire McNamara <daire.mcnamara@xxxxxxxxxxxxx>
> >>>>>>
> >>>>>> Clear MSI bit in ISTATUS register after reading it before
> >>>>>> handling individual MSI bits
> > 
> >>>> Clear the MSI bit in ISTATUS register after reading it, but before
> >>>> reading and handling individual MSI bits from the IMSI register.
> >>>> This avoids a potential race where new MSI bits may be set on the
> >>>> IMSI register after it was read and be missed when the MSI bit in
> >>>> the ISTATUS register is cleared.
> > 
> >>> Honestly, I don't understand enough about IRQs to determine whether
> >>> this is a correct fix.  Hopefully Marc will chime in.  All I really
> >>> know how to do is compare all the drivers and see which ones don't fit
> >>> the typical patterns.
> >>
> >> This seems sensible. In general, edge interrupts need an early Ack
> >> *before* the handler can be run. If it happens after, you're pretty
> >> much guaranteed to lose edges that would be generated between the
> >> handler and the late Ack.
> >>
> >> This can be implemented in HW in a variety of ways (read a register,
> >> write a register, or even both).
> > 
> > Is this something that is or could be documented somewhere under
> > Documentation, e.g., "here are the common canonical patterns to use"?
> > I feel like an idiot because I have this kind of question all the time
> > and I never know how to confidently analyze it.
> 
> Daire is still having the IT issues, so before I resend the patch with
> a new commit message, how is the following:
> 
> Clear the MSI bit in ISTATUS_LOCAL register after reading it, but
> before reading and handling individual MSI bits from the ISTATUS_MSI
> register. This avoids a potential race where new MSI bits may be set
> on the ISTATUS_MSI register after it was read and be missed when the
> MSI bit in the ISTATUS_LOCAL register is cleared.

It is still unclear. You should translate what Marc said above into
how ISTATUS_MSI and ISTATUS_LOCAL work (i.e., describe how the HW works).

Please describe what the registers do and use that to describe
the fix.

Thanks,
Lorenzo

> Reported-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Link: https://lore.kernel.org/linux-pci/20220127202000.GA126335@bhelgaas/
> Fixes: 6f15a9c9f941 ("PCI: microchip: Add Microchip PolarFire PCIe controller driver")
> Signed-off-by: Daire McNamara <daire.mcnamara@xxxxxxxxxxxxx>
> > 
> >>> And speaking of that, I looked at all the users of
> >>> irq_set_chained_handler_and_data() in drivers/pci.  All the handlers
> >>> except mc_handle_intx() and mc_handle_msi() call chained_irq_enter()
> >>> and chained_irq_exit().
> >>>
> >>> Are mc_handle_intx() and mc_handle_msi() just really special, or is
> >>> this a mistake?
> >>
> >> That's just a bug. On the right HW, this would just result in lost
> >> interrupts.
> 
> Separate issue, separate patch. Do you want them in a series or as
> another standalone patch?
> 
> Thanks,
> Conor.
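
To illustrate the ordering discussed in this thread, here is a minimal
user-space sketch in C of an edge-style summary bit being acked early
versus late. It is not the driver code. The register model (a summary
MSI bit in ISTATUS_LOCAL that latches whenever a bit is set in
ISTATUS_MSI, with both cleared by software) is an assumption made for
illustration only, and every name in it (model_hw, LOCAL_MSI_BIT,
model_raise_msi, model_handle_msi, model_run) is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical position of the MSI summary bit in ISTATUS_LOCAL. */
#define LOCAL_MSI_BIT (1u << 0)

/* Assumed register model, not taken from the hardware documentation. */
struct model_hw {
    uint32_t istatus_local; /* summary bit, latched when any MSI arrives */
    uint32_t istatus_msi;   /* one pending bit per MSI vector */
};

/* Model the hardware receiving an MSI: set the vector bit and latch
 * the summary bit. */
static void model_raise_msi(struct model_hw *hw, unsigned int vec)
{
    hw->istatus_msi |= 1u << vec;
    hw->istatus_local |= LOCAL_MSI_BIT;
}

/* One handler invocation. If racing_vec >= 0, that MSI arrives right
 * after the handler has read ISTATUS_MSI, which is the window the
 * thread is concerned about. Returns the number of vectors handled. */
static unsigned int model_handle_msi(struct model_hw *hw, bool early_ack,
                                     int racing_vec)
{
    unsigned int handled = 0;
    uint32_t pending;

    if (early_ack)
        hw->istatus_local &= ~LOCAL_MSI_BIT; /* ack before reading */

    pending = hw->istatus_msi;

    if (racing_vec >= 0)
        model_raise_msi(hw, (unsigned int)racing_vec);

    for (unsigned int vec = 0; vec < 32; vec++) {
        if (pending & (1u << vec)) {
            hw->istatus_msi &= ~(1u << vec); /* clear the handled vector */
            handled++;
        }
    }

    if (!early_ack)
        hw->istatus_local &= ~LOCAL_MSI_BIT; /* late ack wipes the new edge */

    return handled;
}

/* Raise vector 0, let vector 1 race with the first handler run, and
 * keep calling the handler while the summary bit is set. Returns how
 * many of the two MSIs were handled. */
unsigned int model_run(bool early_ack)
{
    struct model_hw hw = { 0, 0 };
    unsigned int handled = 0;
    int racing_vec = 1;

    model_raise_msi(&hw, 0);

    while (hw.istatus_local & LOCAL_MSI_BIT) {
        handled += model_handle_msi(&hw, early_ack, racing_vec);
        racing_vec = -1; /* the race happens only once */
    }

    return handled;
}
```

Under this model, model_run(true) returns 2 because the racing MSI
re-latches the summary bit and the handler runs again. model_run(false)
returns 1: the late ack clears the summary bit while vector 1 is still
pending in ISTATUS_MSI, so nothing triggers another handler run.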


