On Thu, Jun 29, 2023 at 04:42:07PM -0400, Radu Rendec wrote:
> On Thu, 2023-06-29 at 14:57 -0500, Bjorn Helgaas wrote:
> > On Thu, Jun 29, 2023 at 02:30:19PM -0400, Radu Rendec wrote:
> > > The DesignWare PCIe host driver uses a chained interrupt to demultiplex
> > > the downstream MSI interrupts. On Qualcomm SA8540P Ride, enabling both
> > > pcie2a and pcie3a at the same time can create an interrupt storm where
> > > the parent interrupt fires continuously, even though reading the PCIe
> > > host registers doesn't identify any child MSI interrupt source. This
> > > effectively locks up CPU0, which spends all the time servicing these
> > > interrupts.
> > >
> > > This is a clear example of how bypassing the interrupt core by using
> > > chained interrupts can be very dangerous if the hardware misbehaves.
> > >
> > > Convert the driver to use a regular interrupt for the demultiplex
> > > handler. This allows the interrupt storm detector to detect the faulty
> > > interrupt and disable it, allowing the system to run normally.
> >
> > There are many other users of irq_set_chained_handler_and_data() in
> > drivers/pci/controller/. Should they be similarly converted? If not,
> > how do we decide which need to use irq_set_chained_handler_and_data()
> > and which do not?
>
> According to Thomas Gleixner, yes. Obviously I don't want to put words
> in his mouth, but I think that's the gist of what he said in a reply to
> an RFC patch that I sent a few weeks ago:
> https://lore.kernel.org/all/877csohcll.ffs@tglx/

If it's a bug in pcie-designware-host.c, and it's also a bug in the
other callers, we should fix them all. But you do imply that there's
some DesignWare hardware issue involved, too, so I guess it's possible
the other drivers don't have an issue and/or actually require the
chained IRQs. That's why I asked how we should decide.

> > > -static void dw_pcie_free_msi(struct dw_pcie_rp *pp)
> > > +static void __dw_pcie_free_msi(struct dw_pcie_rp *pp, u32 num_ctrls)
> > >  {
> > >  	u32 ctrl;
> > >
> > > -	for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++) {
> > > +	for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> > >  		if (pp->msi_irq[ctrl] > 0)
> > > -			irq_set_chained_handler_and_data(pp->msi_irq[ctrl],
> > > -							 NULL, NULL);
> > > +			free_irq(pp->msi_irq[ctrl], pp);
> > >  	}
> > >
> > >  	irq_domain_remove(pp->msi_domain);
> > >  	irq_domain_remove(pp->irq_domain);
> > >  }
> > >
> > > +#define dw_pcie_free_msi(pp) __dw_pcie_free_msi(pp, MAX_MSI_CTRLS)
> >
> > What is the benefit of the dw_pcie_free_msi() macro?
>
> It allows me to add the num_ctrls parameter to the corresponding
> function (now renamed to __dw_pcie_free_msi()) without forcing all the
> existing call sites to send MAX_MSI_CTRLS explicitly.
>
> I needed that extra parameter to avoid duplicating the tear down code
> on the (new) error path in dw_pcie_msi_init() - see below.
>
> > >  static void dw_pcie_msi_init(struct dw_pcie_rp *pp)
> > >  {
> > >  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> > > @@ -361,9 +353,16 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > >  		return ret;
> > >
> > >  	for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> > > -		if (pp->msi_irq[ctrl] > 0)
> > > -			irq_set_chained_handler_and_data(pp->msi_irq[ctrl],
> > > -							 dw_chained_msi_isr, pp);
> > > +		if (pp->msi_irq[ctrl] > 0) {
> > > +			ret = request_irq(pp->msi_irq[ctrl], dw_pcie_msi_isr, 0,
> > > +					  dev_name(dev), pp);
> > > +			if (ret) {
> > > +				dev_err(dev, "Failed to request irq %d: %d\n",
> > > +					pp->msi_irq[ctrl], ret);
> > > +				__dw_pcie_free_msi(pp, ctrl);
>
> This is where I'm using the extra parameter. If we fail to request an
> interrupt, we need to free all the other interrupts that we have
> requested so far, to leave everything in a clean state. But we can't
> use MAX_MSI_CTRLS with __dw_pcie_free_msi() and rely on the check there
> because there may be extra interrupts that we haven't requested *yet*
> and we would attempt to free them.

Makes sense, thanks.

Bjorn
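
P.S. The body of dw_pcie_msi_isr() isn't visible in the hunks quoted
above. Just as a rough sketch of what the conversion amounts to
(assuming the new handler is nothing more than a thin wrapper around
the driver's existing dw_handle_msi_irq() demultiplex loop, which is
what the request_irq() call suggests):

static irqreturn_t dw_pcie_msi_isr(int irq, void *dev_id)
{
	struct dw_pcie_rp *pp = dev_id;

	/*
	 * Same demultiplex loop the chained handler used to run, now
	 * invoked as a regular handler registered via request_irq().
	 * When nothing is pending, dw_handle_msi_irq() returns IRQ_NONE,
	 * so the genirq spurious/storm detection can kick in and disable
	 * the misbehaving parent interrupt instead of livelocking CPU0.
	 */
	return dw_handle_msi_irq(pp);
}

With the chained setup, by contrast, dw_chained_msi_isr() ran from the
flow handler with the interrupt core bypassed, so nothing could stop a
storm on the parent interrupt.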