Re: [PATCH] PCI: dwc: fix MSI IRQ handler ordering

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/01/2019 17:16, Stephen Warren wrote:
> On 1/7/19 2:36 AM, Gustavo Pimentel wrote:
>> Hi Stephen,
>>
>> On 04/01/2019 21:45, Stephen Warren wrote:
>>> From: Stephen Warren <swarren@xxxxxxxxxx>
>>>
>>> The current code does this when handling MSI IRQs:
>>>
>>> a) Process the irq.
>>> b) Clear the latched IRQ status.
>>>
>>> If a new IRQ occurs any time after (a) has read the IRQ status for the
>>> last time and before (b), it will be lost. For example, this occurs in
>>> practice when using a Marvell 9171 AHCI controller with NCQ enabled;
>>> many command timeouts occur with certain disk access patterns.
>>>
>>> Fix the code to do the following instead, so that if any new IRQs are
>>> raised during the processing of the IRQ, the IRQ status is not cleared,
>>> so that the IRQ is not lost.
>>>
>>> a) Clear the latched IRQ status.
>>> b) Process the IRQ.
>>>
>>> This change reverts commit 8c934095fa2f ("PCI: dwc: Clear MSI interrupt
>>> status after it is handled, not before")
>>>
>>> This change re-applies commit ca1658921b63 ("PCI: designware: Fix
>>> missing MSI IRQs")
>>>
>>> Signed-off-by: Stephen Warren <swarren@xxxxxxxxxx>
>>> Cc: Arnd Bergmann <arnd@xxxxxxxx>
>>> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>>> Cc: Faiz Abbas <faiz_abbas@xxxxxx>
>>> Cc: Harro Haan <hrhaan@xxxxxxxxx>
>>> Cc: Jingoo Han <jg1.han@xxxxxxxxxxx>
>>> Cc: Joao Pinto <jpinto@xxxxxxxxxxxx>
>>> Cc: Juergen Beisert <jbe@xxxxxxxxxxxxxx>
>>> Cc: Marek Vasut <marex@xxxxxxx>
>>> Cc: Matthias Mann <m.mann@xxxxxxxxxxxxxxxxxxxxxx>
>>> Cc: Mohit Kumar <mohit.kumar@xxxxxx>
>>> Cc: Pratyush Anand <pratyush.anand@xxxxxx>
>>> Cc: Richard Zhu <hong-xing.zhu@xxxxxxxxxxxxx>
>>> Cc: Sean Cross <xobs@xxxxxxxxxx>
>>> Cc: Shawn Guo <shawn.guo@xxxxxxxxxx>
>>> Cc: Siva Reddy Kallam <siva.kallam@xxxxxxxxxxx>
>>> Cc: Srikanth T Shivanand <ts.srikanth@xxxxxxxxxxx>
>>> Cc: Tim Harvey <tharvey@xxxxxxxxxxxxx>
>>> ---
>>> Note: This issue was found in downstream NVIDIA 4.9 and 4.14 kernels.
>>> However, the exact same code structure is present in mainline and I have
>>> no reason to believe the problem would not reproduce there. I have
>>> compile tested but not runtime tested it in mainline, since my board is
>>> not yet supported in mainline.
>>> ---
>>>   drivers/pci/controller/dwc/pcie-designware-host.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
>>> index 692dd1b264fb..7fd6c56a6f35 100644
>>> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
>>> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
>>> @@ -98,10 +98,10 @@ irqreturn_t dw_handle_msi_irq(struct pcie_port *pp)
>>>   			irq = irq_find_mapping(pp->irq_domain,
>>>   					       (i * MAX_MSI_IRQS_PER_CTRL) +
>>>   					       pos);
>>> -			generic_handle_irq(irq);
>>>   			dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS +
>>>   						(i * MSI_REG_CTRL_BLOCK_SIZE),
>>>   					    4, 1 << pos);
>>> +			generic_handle_irq(irq);
>>>   			pos++;
>>>   		}
>>>   	}
>>
>> This fix was already suggested by Trent Piepho, however, after review by Marc
>> Zyngier and some information obtained from Synopsys IP development team, the
>> following set of patches was generated to correct the problem.
>>
>> 830920e065e9 ("PCI: dwc: Use interrupt masking instead of disabling")
>> fce5423e4f43 ("PCI: dwc: Take lock when ACKing an interrupt")
>> 3f7bb2ec20ce ("PCI: dwc: Move interrupt acking into the proper callback")
> 
> OK. I assume that dw_pci_bottom_ack() gets called before 
> dw_handle_msi_irq() for each IRQ? In which case, the last of those 3 

Yes, the dw_pci_bottom_ack() gets called before dw_handle_msi_irq().

> changes is essentially identical to the patch I proposed, it's just that 
> the acking is handled in a separate function rather than earlier in the 
> same function.
> 

No, there is also two other fixes. A missing lock/unlock on the ack and the
correct use of the MASK/UNMASK interrupt register instead of ENABLE interrupt
register.




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux