On Thu, Jun 23, 2022 at 09:31:40PM +0100, Marc Zyngier wrote:
> On Thu, 23 Jun 2022 17:49:42 +0100,
> Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > On Thu, Jun 23, 2022 at 06:32:40PM +0200, Pali Rohár wrote:
> > > On Thursday 23 June 2022 11:27:47 Bjorn Helgaas wrote:
> > > > On Tue, May 24, 2022 at 02:28:17PM +0200, Pali Rohár wrote:
> > > > > Same as in commit a3b69dd0ad62 ("Revert "PCI: aardvark:
> > > > > Rewrite IRQ code to chained IRQ handler"") for the
> > > > > pci-aardvark driver, use devm_request_irq() instead of a
> > > > > chained IRQ handler in the pci-mvebu.c driver.
> > > > >
> > > > > This change fixes affinity support and allows pinning
> > > > > interrupts from different PCIe controllers to different CPU
> > > > > cores.
> > > >
> > > > Several other drivers use irq_set_chained_handler_and_data().
> > > > Do any of them need similar changes?
> > >
> > > I do not know. This needs testing on HW that uses those other
> > > drivers.
> > >
> > > > The commit log suggests that using chained IRQ handlers breaks
> > > > affinity support. But perhaps that's not the case and the
> > > > real culprit is some other difference between mvebu and the
> > > > other drivers.
> > >
> > > It is possible. But a similar patch (the revert linked below) was
> > > required for aardvark. I tested the same approach on mvebu and it
> > > fixed affinity support.
> >
> > This feels like something we should understand better. If
> > irq_set_chained_handler_and_data() is a problem for affinity, we
> > should fix it across the board in all the drivers at once.
> >
> > If the real problem is something different, we should figure that
> > out and document it in the commit log.
> >
> > I cc'd Marc in case he has time to educate us.
>
> Thanks for roping me in.
>
> The problem with changing affinities for chained (or multiplexing)
> interrupts is, to make it short, that it breaks the existing
> userspace ABI that a change in affinity affects only the interrupt
> userspace acts upon, and no other. Which is why we don't expose any
> affinity setting for such an interrupt, as by definition changing
> its affinity affects all the interrupts that are muxed onto it.
>
> By turning a chained interrupt into a normal handler, people work
> around the above rule and break the contract the kernel has with
> userspace.
>
> Do I think this is acceptable? No. Can something be done about this?
> Maybe.
>
> Marek asked this exact question last month[1], and I gave a detailed
> explanation of what could be done to improve matters, allowing this
> to happen as long as userspace is made aware of the effects, which
> means creating a new userspace ABI.
>
> I would rather see people work on a proper solution than add bad
> hacks that only work in environments where the userspace ABI can be
> safely ignored, such as on a closed, embedded device.
>
> [1] https://lore.kernel.org/all/20220502102137.764606ee@thinkpad/

OK, this patch [2] has languished forever, and I don't know how to
move forward.

The patch basically changes mvebu_pcie_irq_handler() from a chained
IRQ handler to a handler registered with devm_request_irq() so it can
be pinned to a CPU.

How are we supposed to decide whether this is safe? What should we
look at in the patch?

IIUC on mvebu, there's a single IRQ (port->intx_irq, described by
"intx" in the DT interrupt-names property) that invokes
mvebu_pcie_irq_handler(), which loops through and handles INTA, INTB,
INTC, and INTD.

I think if port->intx_irq is pinned to CPU X, that means INTA, INTB,
INTC, and INTD are all pinned to that same CPU.
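To make the muxing concrete, my mental model of that handler is
roughly the following (a sketch, not a quote of the actual pci-mvebu
code; the register offset and the struct fields are placeholders):

  #include <linux/bitops.h>
  #include <linux/io.h>
  #include <linux/irq.h>
  #include <linux/irqchip/chained_irq.h>
  #include <linux/irqdomain.h>
  #include <linux/pci.h>

  /* One parent IRQ ("intx"), four INTx lines decoded in software. */
  static void mvebu_pcie_irq_handler(struct irq_desc *desc)
  {
          struct mvebu_pcie_port *port = irq_desc_get_handler_data(desc);
          struct irq_chip *chip = irq_desc_get_chip(desc);
          unsigned long cause;
          int i;

          chained_irq_enter(chip, desc);

          /* PCIE_INT_CAUSE_OFF is a placeholder register offset. */
          cause = readl(port->base + PCIE_INT_CAUSE_OFF);

          /*
           * Each INTA..INTD demux runs on whatever CPU delivered
           * port->intx_irq, so all four share that IRQ's affinity.
           */
          for_each_set_bit(i, &cause, PCI_NUM_INTX)
                  generic_handle_domain_irq(port->intx_irq_domain, i);

          chained_irq_exit(chip, desc);
  }

Switching the registration style doesn't change that muxing; it only
changes whether userspace can steer the parent IRQ.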
I assume changing to devm_request_irq() means port->intx_irq will
appear in /proc/interrupts and can be pinned to a CPU. Is it a
problem if INTA, INTB, INTC, and INTD for that controller all
effectively share intx_irq and are pinned to the same CPU?

AFAICT we currently have three PCI host controllers with INTx
handlers that are registered with devm_request_irq(), which is what
[2] proposes to do:

  advk_pcie_irq_handler()
  xilinx_pl_dma_pcie_intx_flow()
  xilinx_pcie_intr_handler()

Do we assume that these are mistakes that shouldn't be emulated?

[2] https://lore.kernel.org/r/20220524122817.7199-1-pali@xxxxxxxxxx
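For completeness, the shape of the change [2] makes, as I read it
(illustrative only: the two registration calls below are the generic
IRQ API, but the surrounding names and flags are guesses rather than
the actual patch):

  /*
   * Today: chained registration; the handler has the
   * void (*)(struct irq_desc *) flavour and the parent IRQ is not
   * exposed to userspace affinity controls.
   */
  irq_set_chained_handler_and_data(port->intx_irq,
                                   mvebu_pcie_irq_handler, port);

  /*
   * With [2]: a normal handler of type irqreturn_t (*)(int, void *);
   * port->intx_irq then appears in /proc/interrupts and its affinity
   * can be changed via /proc/irq/<N>/smp_affinity.
   */
  ret = devm_request_irq(dev, port->intx_irq, mvebu_pcie_irq_handler,
                         0, "mvebu-pcie-intx", port);
  if (ret)
          return ret;

Which, as far as I can tell, is exactly the pattern the three drivers
above already use.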