Hi again, Marc,

On Thu, 7 Oct 2021 at 15:42, Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> Right. Let's see if we can be less brutal and only quirk the AHCI
> device (patch below, completely untested). I'm a bit concerned that
> all the devices in this system seem to report 'Maskable-'...

True. However…

rui@vedder:~$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  0:        124          0          0          0   IO-APIC   2-edge      timer
  1:          0          0          0          0   IO-APIC   1-edge      i8042
  8:          0          0          0          1   IO-APIC   8-edge      rtc0
  9:          0          0          0          0   IO-APIC   9-fasteoi   acpi
 12:          0          1          0          0   IO-APIC  12-edge      i8042
 20:          0          0      12734     852750   IO-APIC  20-fasteoi   ehci_hcd:usb2, enp0s10
 21:         25          0          0          0   IO-APIC  21-fasteoi   ohci_hcd:usb4
 22:      25672        288          0          0   IO-APIC  22-fasteoi   ehci_hcd:usb1
 23:          0          0          0        709   IO-APIC  23-fasteoi   ohci_hcd:usb3, snd_hda_intel:card0
 29:          0          0      83164       1779   PCI-MSI 1572864-edge      nvkm
 30:       3595       5645          0          0   PCI-MSI 180224-edge      ahci[0000:00:0b.0]
NMI:          0          0          0          0   Non-maskable interrupts
LOC:     202323     194669     107282     197322   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
IWI:          0          0          0          0   IRQ work interrupts
RTR:          0          0          0          0   APIC ICR read retries
RES:        179        995        208        273   Rescheduling interrupts
CAL:       1149       1495        949       1211   Function call interrupts
TLB:        110         76         79         79   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:         20         20         20         20   Machine check polls
ERR:          1
MIS:          0
PIN:          0          0          0          0   Posted-interrupt notification event
NPI:          0          0          0          0   Nested posted-interrupt event
PIW:          0          0          0          0   Posted-interrupt wakeup event
rui@vedder:~$

… the only devices using MSIs are the AHCI controller and the GPU, so I
think any damage would be more contained (and obvious), in this case.

>
> 	M.
>
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 0099a00af361..2f9ec7210991 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -479,6 +479,9 @@ msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
>  		goto out;
>
>  	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
> +	/* Lies, damned lies, and MSIs */

Best comment ever. :)

> +	if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
> +		control |= PCI_MSI_FLAGS_MASKBIT;
>
>  	entry->msi_attrib.is_msix = 0;
>  	entry->msi_attrib.is_64 = !!(control & PCI_MSI_FLAGS_64BIT);
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 4537d1ea14fd..dc7741431bf3 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5795,3 +5795,9 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
>  }
>  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
>  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> +
> +static void nvidia_ion_ahci_fixup(struct pci_dev *pdev)
> +{
> +	pdev->dev_flags |= PCI_MSI_FLAGS_MASKBIT;
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, 0x0ab8, nvidia_ion_ahci_fixup);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index cd8aa6fce204..152a4d74f87f 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -233,6 +233,8 @@ enum pci_dev_flags {
>  	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
>  	/* Don't use Relaxed Ordering for TLPs directed at this device */
>  	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> +	/* Device does honor MSI masking despite saying otherwise */
> +	PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12),
>  };
>
>  enum pci_irq_reroute_variant {
>
>
> --

I'm taking this one for a ride too and will report back.

Thanks,
Rui