Hi Bjorn,
On 2021/10/26 2:06, Bjorn Helgaas wrote:
On Tue, Oct 26, 2021 at 12:51:07AM +0900, Kunihiko Hayashi wrote:
Hi all,
I found that "lspci -vv" triggers unbounded recursion in the linux-next
kernel. As a result, the kernel crashes with a stack overflow.
I reproduced this issue on an Akebi96 board with UniPhier LD20 and a
DWC3 PCIe controller, using an R8169 Ethernet card.
# lspci -s 01:00.0 -vv
01:00.0 Class 0200: Device 10ec:8168 (rev 06)
[ 19.152157] Insufficient stack space to handle exception!
...
[ 19.152449] Hardware name: Akebi96 (DT)
[ 19.152455] Call trace:
[ 19.152458] dump_backtrace+0x0/0x1b0
[ 19.152484] show_stack+0x20/0x30
[ 19.152503] dump_stack_lvl+0x68/0x84
[ 19.152525] dump_stack+0x18/0x34
[ 19.152542] panic+0x154/0x34c
[ 19.152556] nmi_panic+0x94/0x98
[ 19.152577] panic_bad_stack+0xec/0x100
[ 19.152590] handle_bad_stack+0x38/0x68
[ 19.152606] __bad_stack+0x8c/0x90
[ 19.152620] pci_vpd_read+0xc/0x1f8
[ 19.152639] pci_vpd_size+0x58/0x1a0
[ 19.152651] pci_vpd_read+0x1a0/0x1f8
[ 19.152669] __pci_read_vpd+0x94/0xc0
[ 19.152681] pci_vpd_size+0x58/0x1a0
[ 19.152692] pci_vpd_read+0x1a0/0x1f8
[ 19.152710] __pci_read_vpd+0x94/0xc0
[ 19.152722] pci_vpd_size+0x58/0x1a0
[ 19.152734] pci_vpd_read+0x1a0/0x1f8
[ 19.152752] __pci_read_vpd+0x94/0xc0
...
[ 19.155039] pci_vpd_size+0x58/0x1a0
[ 19.155051] pci_vpd_read+0x1a0/0x1f8
[ 19.155069] __pci_read_vpd+0x94/0xc0
[ 19.155081] pci_vpd_size+0x58/0x1a0
[ 19.155093] pci_vpd_read+0x1a0/0x1f8
[ 19.155111] __pci_read_vpd+0x94/0xc0
[ 19.155124] pci_vpd_size+0x58/0x1a0
[ 19.155136] pci_vpd_read+0x1a0/0x1f8
[ 19.155153] __pci_read_vpd+0x94/0xc0
[ 19.155166] vpd_read+0x28/0x38
[ 19.155177] sysfs_kf_bin_read+0x74/0x98
The following commit removed the initialization of dev->vpd.len:
commit 80484b7f8db101119928c73e7ce09ae6be54e45c
PCI/VPD: Use pci_read_vpd_any() in pci_vpd_size()
When pci_read_vpd_any() is called while dev->vpd.len is still zero,
pci_vpd_size() ends up calling itself recursively without end:
pci_vpd_available()                  // dev->vpd.len == 0
  -> pci_vpd_size()
    -> pci_read_vpd_any()
      -> __pci_read_vpd()
        -> pci_vpd_read()
          -> pci_vpd_available()     // dev->vpd.len == 0
            -> pci_vpd_size()
              ...
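
To make the loop above easier to follow, here is a small user-space model of it. This is only a sketch: every name in it (model_vpd, model_vpd_size() and so on) is hypothetical, the depth cap merely stands in for running out of kernel stack, and the check_size flag is one possible way to break the cycle, not necessarily what any actual fix does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names are hypothetical; this is a user-space model, not kernel code. */
#define MODEL_DEPTH_LIMIT 64u   /* stands in for exhausting the kernel stack */
#define MODEL_VPD_LEN     128u  /* arbitrary size found by a sizing pass */

struct model_vpd {
	size_t len;               /* 0 means "not sized yet", like dev->vpd.len */
	bool guard_sizing_reads;  /* true: reads issued while sizing skip the check */
	unsigned int depth;       /* current nesting of model_vpd_size() */
	unsigned int max_depth;   /* deepest nesting seen so far */
};

static size_t model_vpd_size(struct model_vpd *vpd);

/* Sizes the VPD lazily on first use, unless the caller opts out. */
static bool model_vpd_available(struct model_vpd *vpd, bool check_size)
{
	if (!check_size)
		return true;
	if (vpd->len == 0)
		vpd->len = model_vpd_size(vpd);
	return vpd->len != 0;
}

/* Every read first asks whether the VPD is available. */
static int model_vpd_read(struct model_vpd *vpd, bool check_size)
{
	return model_vpd_available(vpd, check_size) ? 0 : -1;
}

/* Sizing has to read VPD data, which is where the cycle closes. */
static size_t model_vpd_size(struct model_vpd *vpd)
{
	size_t size = 0;

	assert(vpd != NULL);
	vpd->depth++;
	if (vpd->depth > vpd->max_depth)
		vpd->max_depth = vpd->depth;

	/* Unguarded, this read re-enters model_vpd_size() while len is 0. */
	if (vpd->depth < MODEL_DEPTH_LIMIT &&
	    model_vpd_read(vpd, !vpd->guard_sizing_reads) == 0)
		size = MODEL_VPD_LEN;

	vpd->depth--;
	return size;
}

/* Runs one top-level read and reports how deep the sizing code nested. */
static unsigned int model_max_depth(bool guard_sizing_reads)
{
	struct model_vpd vpd = { .len = 0, .guard_sizing_reads = guard_sizing_reads };

	(void)model_vpd_read(&vpd, true);
	return vpd.max_depth;
}
```

In this model, model_max_depth(false) runs into the artificial depth cap, while model_max_depth(true) sizes the VPD in a single pass.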
This issue didn't occur before this commit was applied.
Has anyone else run into the same issue?
Likely this patch:
https://lore.kernel.org/r/6211be8a-5d10-8f3a-6d33-af695dc35caf@xxxxxxxxx
which I obviously need to move to the top of my list. If you can
confirm that this fixes it, that would be awesome!
Thank you for the information.
I have confirmed that applying the patch fixes the issue.
Thank you,
---
Best Regards
Kunihiko Hayashi