On Thu, Jun 29, 2023 at 02:30:19PM -0400, Radu Rendec wrote: > The DesignWare PCIe host driver uses a chained interrupt to demultiplex > the downstream MSI interrupts. On Qualcomm SA8540P Ride, enabling both > pcie2a and pcie3a at the same time can create an interrupt storm where > the parent interrupt fires continuously, even though reading the PCIe > host registers doesn't identify any child MSI interrupt source. This > effectively locks up CPU0, which spends all the time servicing these > interrupts. > > This is a clear example of how bypassing the interrupt core by using > chained interrupts can be very dangerous if the hardware misbehaves. > > Convert the driver to use a regular interrupt for the demultiplex > handler. This allows the interrupt storm detector to detect the faulty > interrupt and disable it, allowing the system to run normally. There are many other users of irq_set_chained_handler_and_data() in drivers/pci/controller/. Should they be similarly converted? If not, how do we decide which need to use irq_set_chained_handler_and_data() and which do not? > Signed-off-by: Radu Rendec <rrendec@xxxxxxxxxx> > --- > .../pci/controller/dwc/pcie-designware-host.c | 35 +++++++++---------- > 1 file changed, 17 insertions(+), 18 deletions(-) > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c > index 9952057c8819c..b603796d415d7 100644 > --- a/drivers/pci/controller/dwc/pcie-designware-host.c > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c > @@ -83,18 +83,9 @@ irqreturn_t dw_handle_msi_irq(struct dw_pcie_rp *pp) > return ret; > } > > -/* Chained MSI interrupt service routine */ > -static void dw_chained_msi_isr(struct irq_desc *desc) > +static irqreturn_t dw_pcie_msi_isr(int irq, void *dev_id) > { > - struct irq_chip *chip = irq_desc_get_chip(desc); > - struct dw_pcie_rp *pp; > - > - chained_irq_enter(chip, desc); > - > - pp = irq_desc_get_handler_data(desc); > - dw_handle_msi_irq(pp); > - > - chained_irq_exit(chip, desc); > + return dw_handle_msi_irq(dev_id); > } > > static void dw_pci_setup_msi_msg(struct irq_data *d, struct msi_msg *msg) > @@ -254,20 +245,21 @@ int dw_pcie_allocate_domains(struct dw_pcie_rp *pp) > return 0; > } > > -static void dw_pcie_free_msi(struct dw_pcie_rp *pp) > +static void __dw_pcie_free_msi(struct dw_pcie_rp *pp, u32 num_ctrls) > { > u32 ctrl; > > - for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++) { > + for (ctrl = 0; ctrl < num_ctrls; ctrl++) { > if (pp->msi_irq[ctrl] > 0) > - irq_set_chained_handler_and_data(pp->msi_irq[ctrl], > - NULL, NULL); > + free_irq(pp->msi_irq[ctrl], pp); > } > > irq_domain_remove(pp->msi_domain); > irq_domain_remove(pp->irq_domain); > } > > +#define dw_pcie_free_msi(pp) __dw_pcie_free_msi(pp, MAX_MSI_CTRLS) What is the benefit of the dw_pcie_free_msi() macro? > static void dw_pcie_msi_init(struct dw_pcie_rp *pp) > { > struct dw_pcie *pci = to_dw_pcie_from_pp(pp); > @@ -361,9 +353,16 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp) > return ret; > > for (ctrl = 0; ctrl < num_ctrls; ctrl++) { > - if (pp->msi_irq[ctrl] > 0) > - irq_set_chained_handler_and_data(pp->msi_irq[ctrl], > - dw_chained_msi_isr, pp); > + if (pp->msi_irq[ctrl] > 0) { > + ret = request_irq(pp->msi_irq[ctrl], dw_pcie_msi_isr, 0, > + dev_name(dev), pp); > + if (ret) { > + dev_err(dev, "Failed to request irq %d: %d\n", > + pp->msi_irq[ctrl], ret); > + __dw_pcie_free_msi(pp, ctrl); > + return ret; > + } > + } > } > > /* > -- > 2.41.0 >