Hi Bjorn, On Thu, 23 Jun 2022 17:49:42 +0100, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > [+cc Marc, IRQ affinity vs chained IRQ handlers] > > On Thu, Jun 23, 2022 at 06:32:40PM +0200, Pali Rohár wrote: > > On Thursday 23 June 2022 11:27:47 Bjorn Helgaas wrote: > > > On Tue, May 24, 2022 at 02:28:17PM +0200, Pali Rohár wrote: > > > > Same as in commit a3b69dd0ad62 ("Revert "PCI: aardvark: Rewrite > > > > IRQ code to chained IRQ handler"") for pci-aardvark driver, use > > > > devm_request_irq() instead of chained IRQ handler in pci-mvebu.c > > > > driver. > > > > > > > > This change fixes affinity support and allows to pin interrupts > > > > from different PCIe controllers to different CPU cores. > > > > > > Several other drivers use irq_set_chained_handler_and_data(). Do > > > any of them need similar changes? > > > > I do not know. This needs testing on HW which use those other > > drivers. > > > > > The commit log suggests that using chained IRQ handlers breaks > > > affinity support. But perhaps that's not the case and the real > > > culprit is some other difference between mvebu and the other > > > drivers. > > > > It is possible. But similar patch (revert; linked below) was > > required for aardvark. I tested same approach on mvebu and it fixed > > affinity support. > > This feels like something we should understand better. If > irq_set_chained_handler_and_data() is a problem for affinity, we > should fix it across the board in all the drivers at once. > > If the real problem is something different, we should figure that out > and document it in the commit log. > > I cc'd Marc in case he has time to educate us. Thanks for roping me in. The problem of changing affinities for chained (or multiplexing) interrupts is, to make it short, that it breaks the existing userspace ABI that a change in affinity affects only the interrupt userspace acts upon, and no other. Which is why we don't expose any affinity setting for such an interrupt, as by definition changing its affinity affects all the interrupts that are muxed onto it. By turning a chained interrupt into a normal handler, people work around the above rule and break the contract the kernel has with userspace. Do I think this is acceptable? No. Can something be done about this? Maybe. Marek asked this exact question last month[1], and I gave a detailed explanation of what could be done to improve matters, allowing this to happen as long as userspace is made aware of the effects, which means creating a new userspace ABI. I would rather see people work on a proper solution than add bad hacks that only work in environments where the userspace ABI can be safely ignored, such as on an closed, embedded device. HTH, M. [1] https://lore.kernel.org/all/20220502102137.764606ee@thinkpad/ -- Without deviation from the norm, progress is not possible.