On Tue, Mar 11, 2025 at 09:52:28PM +0800, Bo Sun wrote:
> On our Marvell OCTEON CN96XX board, we observed the following panic on
> the latest kernel:
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080
> CPU: 22 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc6 #20
> Hardware name: Marvell OcteonTX CN96XX board (DT)
> pc : of_pci_add_properties+0x278/0x4c8
> Call trace:
> of_pci_add_properties+0x278/0x4c8 (P)
> of_pci_make_dev_node+0xe0/0x158
> pci_bus_add_device+0x158/0x228
> pci_bus_add_devices+0x40/0x98
> pci_host_probe+0x94/0x118
> pci_host_common_probe+0x130/0x1b0
> platform_probe+0x70/0xf0
>
> The dmesg logs indicated that the PCI bridge was scanning with an invalid bus range:
> pci-host-generic 878020000000.pci: PCI host bridge to bus 0002:00
> pci_bus 0002:00: root bus resource [bus 00-ff]
> pci 0002:00:00.0: scanning [bus f9-f9] behind bridge, pass 0
> pci 0002:00:01.0: scanning [bus fa-fa] behind bridge, pass 0
> pci 0002:00:02.0: scanning [bus fb-fb] behind bridge, pass 0
> pci 0002:00:03.0: scanning [bus fc-fc] behind bridge, pass 0
> pci 0002:00:04.0: scanning [bus fd-fd] behind bridge, pass 0
> pci 0002:00:05.0: scanning [bus fe-fe] behind bridge, pass 0
> pci 0002:00:06.0: scanning [bus ff-ff] behind bridge, pass 0
> pci 0002:00:07.0: scanning [bus 00-00] behind bridge, pass 0
> pci 0002:00:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> pci 0002:00:08.0: scanning [bus 01-01] behind bridge, pass 0
> pci 0002:00:09.0: scanning [bus 02-02] behind bridge, pass 0
> pci 0002:00:0a.0: scanning [bus 03-03] behind bridge, pass 0
> pci 0002:00:0b.0: scanning [bus 04-04] behind bridge, pass 0
> pci 0002:00:0c.0: scanning [bus 05-05] behind bridge, pass 0
> pci 0002:00:0d.0: scanning [bus 06-06] behind bridge, pass 0
> pci 0002:00:0e.0: scanning [bus 07-07] behind bridge, pass 0
> pci 0002:00:0f.0: scanning [bus 08-08] behind bridge, pass 0
>
> This regression was introduced by commit 7246a4520b4b ("PCI:
> Use preserve_config in place of pci_flags"). On our board, the 0002:00:07.0
> bridge is misconfigured by the bootloader. Both its secondary and
> subordinate bus numbers are initialized to 0, while its fixed secondary
> bus number is set to 8. However, bus number 8 is also assigned to another
> bridge (0002:00:0f.0). Although this is a bootloader issue, before the
> change in commit 7246a4520b4b, the PCI_REASSIGN_ALL_BUS flag was set
> by default when PCI_PROBE_ONLY was not enabled, ensuring that all the
> bus numbers for these bridges were reassigned, avoiding any conflicts.
>
> After the change introduced in commit 7246a4520b4b, the bus numbers
> assigned by the bootloader are reused by all other bridges, except
> the misconfigured 0002:00:07.0 bridge. The kernel attempts to reconfigure
> 0002:00:07.0 by reusing the fixed secondary bus number 8 assigned by
> the bootloader. However, since a pci_bus has already been allocated for
> bus 8 due to the probe of 0002:00:0f.0, no new pci_bus is allocated for
> 0002:00:07.0. This results in a PCI bridge device without a pci_bus
> attached (pdev->subordinate == NULL). Consequently, accessing
> pdev->subordinate in of_pci_prop_bus_range() leads to a NULL pointer
> dereference.
>
> To summarize, we need to set the PCI_REASSIGN_ALL_BUS flag when
> PCI_PROBE_ONLY is not enabled in order to work around issues like the
> one described above.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 7246a4520b4b ("PCI: Use preserve_config in place of pci_flags")
> Signed-off-by: Bo Sun <Bo.Sun.CN@xxxxxxxxxxxxx>
> ---
> Changes in v2:
> - Added explicit comment about the quirk, as requested by Mani.
> - Made commit message more clear, as requested by Bjorn.
>
> drivers/pci/quirks.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 82b21e34c545..cec58c7479e1 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -6181,6 +6181,23 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect);
> DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect);
> DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, rom_bar_overlap_defect);
>
> +/*
> + * Quirk for Marvell CN96XX/CN10XXX boards:
> + *
> + * Adds PCI_REASSIGN_ALL_BUS unless PCI_PROBE_ONLY is set, forcing bus number
> + * reassignment to avoid conflicts caused by bootloader misconfigured PCI bridges.
> + *

Do we really need to care about PCI_PROBE_ONLY in the quirk? Why can't we make
it unconditional?

> + * This resolves a regression introduced by commit 7246a4520b4b ("PCI: Use
> + * preserve_config in place of pci_flags"), which removed this behavior.

I don't think mentioning the commit is really needed here.

- Mani

-- 
மணிவண்ணன் சதாசிவம்