On 11/4/24 17:44, Bjorn Helgaas wrote:
> [+cc Thomas, author of 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X
> entries")]
>
> On Mon, Nov 04, 2024 at 05:34:42AM -0600, Dullfire wrote:
>> I have also bisected the kernel, and determined that upstream commit
>> 7d5ec3d3612396dc6d4b76366d20ab9fc06f399f revealed this issue. This commit
>> adds read to the mask status before any write to PCI_MSIX_ENTRY_DATA, thus
>> provoking the issue.
>
> 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") appeared in
> v5.14 in 2021. Surely other drivers use MSI-X and would have been
> tested on sparcv9 since then? Just based on the age of 7d5ec3d36123,
> I would guess some kind of niu issue. But Thomas will know much more.

Yeah, I wasn't very clear: I believe this problem is specific to the niu
module. My suspicion is hardware errata and/or an issue in the builtin
hypervisor. My T5240 has several other PCIe devices, none of which exhibit
this issue. I will have to check later if any use MSIX.

Speaking of test cases: It is worth pointing out that any write to
ENTRY_DATA appears to be sufficient to allow subsequent reads to that MSIX
table entry to work. Notably, booting into a pre 7d5ec3d36123 kernel, and
then rebooting into a newer kernel will succeed, because the registers were
written to under the old kernel. I had to power off the unit to reproduce
the issue if a kernel successfully initialized the device.

Regards,
Jonathan Currier