Re: Kernel panic with niu module

On 11/4/24 17:44, Bjorn Helgaas wrote:
> [+cc Thomas, author of 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X
> entries")]
> 
> On Mon, Nov 04, 2024 at 05:34:42AM -0600, Dullfire wrote:
>> I have also bisected the kernel, and determined that  upstream commit
>> 7d5ec3d3612396dc6d4b76366d20ab9fc06f399f revealed this issue. This commit
>> adds read to the mask status before any write to PCI_MSIX_ENTRY_DATA, thus
>> provoking the issue.
> 
> 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") appeared in
> v5.14 in 2021.  Surely other drivers use MSI-X and would have been
> tested on sparcv9 since then?  Just based on the age of 7d5ec3d36123,
> I would guess some kind of niu issue.  But Thomas will know much more.

Yeah, I wasn't very clear: I believe this problem is specific to the niu
module. My suspicion is a hardware erratum and/or an issue in the built-in
hypervisor.

My T5240 has several other PCIe devices, none of which exhibit this issue.
I will have to check later whether any of them use MSI-X.

Speaking of test cases: it is worth pointing out that any write to ENTRY_DATA
appears to be sufficient to allow subsequent reads of that MSI-X table entry
to work. Notably, booting into a pre-7d5ec3d36123 kernel and then rebooting
into a newer kernel will succeed, because the registers were already written
under the old kernel. Once any kernel has successfully initialized the device,
I had to power off the unit to reproduce the issue.
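
To make the ordering dependency concrete, here is a minimal user-space sketch
in C. It only simulates the observed behaviour and is not the kernel code or a
proposed patch. The register offsets match Linux's pci_regs.h, but every
sim_* name, the table size, and the "read fails until ENTRY_DATA has been
written" rule are assumptions used purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Offsets within one 16-byte MSI-X table entry (values as in pci_regs.h). */
#define PCI_MSIX_ENTRY_SIZE         16
#define PCI_MSIX_ENTRY_DATA         0x8
#define PCI_MSIX_ENTRY_VECTOR_CTRL  0xc
#define PCI_MSIX_ENTRY_CTRL_MASKBIT 0x1

#define SIM_NUM_ENTRIES 4 /* hypothetical table size */

/* Hypothetical device model: each entry remembers whether its DATA
 * register has been written since the last power cycle. */
struct sim_msix_table {
    uint32_t regs[SIM_NUM_ENTRIES][PCI_MSIX_ENTRY_SIZE / 4];
    bool data_written[SIM_NUM_ENTRIES];
};

/* Power cycle: all table state is lost. A warm reboot is modelled by
 * simply not calling this, so the state survives. */
void sim_power_cycle(struct sim_msix_table *t)
{
    memset(t, 0, sizeof(*t));
}

void sim_writel(struct sim_msix_table *t, unsigned idx, unsigned off,
                uint32_t val)
{
    assert(idx < SIM_NUM_ENTRIES && off < PCI_MSIX_ENTRY_SIZE && off % 4 == 0);
    t->regs[idx][off / 4] = val;
    if (off == PCI_MSIX_ENTRY_DATA)
        t->data_written[idx] = true;
}

/* Returns false where the real machine would fault on the read. */
bool sim_readl(const struct sim_msix_table *t, unsigned idx, unsigned off,
               uint32_t *val)
{
    assert(idx < SIM_NUM_ENTRIES && off < PCI_MSIX_ENTRY_SIZE && off % 4 == 0);
    if (!t->data_written[idx])
        return false;
    *val = t->regs[idx][off / 4];
    return true;
}

/* Ordering after 7d5ec3d36123: the mask state is read before anything
 * has been written to ENTRY_DATA. */
bool sim_mask_entry_read_first(struct sim_msix_table *t, unsigned idx)
{
    uint32_t ctrl;

    if (!sim_readl(t, idx, PCI_MSIX_ENTRY_VECTOR_CTRL, &ctrl))
        return false;
    sim_writel(t, idx, PCI_MSIX_ENTRY_VECTOR_CTRL,
               ctrl | PCI_MSIX_ENTRY_CTRL_MASKBIT);
    return true;
}

/* Hypothetical ordering that avoids the failure: write ENTRY_DATA first. */
bool sim_mask_entry_write_data_first(struct sim_msix_table *t, unsigned idx)
{
    sim_writel(t, idx, PCI_MSIX_ENTRY_DATA, 0);
    return sim_mask_entry_read_first(t, idx);
}
```

In this model, sim_mask_entry_read_first() fails right after
sim_power_cycle(), but succeeds once sim_mask_entry_write_data_first() has
run on the same entry, and keeps succeeding until the next power cycle. That
mirrors why a warm reboot from an older kernel hides the problem.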


Regards,
Jonathan Currier




