On Mon, 24 Jul 2023 21:52:34 -0400
Woody Suwalski <terraluna977@xxxxxxxxx> wrote:

> Igor Mammedov wrote:
> > Woody, thanks for testing,
> >
> > can you try the following patch, which tries to work around a NULL
> > bus->self (if that's really the culprit) and prints extra debug
> > information.
> > Add the following to the kernel command line (make sure that
> > CONFIG_DYNAMIC_DEBUG is enabled):
> >
> > dyndbg="file drivers/pci/access.c +p; file drivers/pci/hotplug/acpiphp_glue.c +p; file drivers/pci/bus.c +p; file drivers/pci/pci.c +p; file drivers/pci/setup-bus.c +p" ignore_loglevel
> >
> > What I find odd in your logs is that enable_slot() is called while
> > native PCIe should be used. Additional info might help to understand
> > what's going on:
> >   1: 'lspci' output
> >   2: DSDT and all SSDT ACPI tables (you can use 'acpidump -b' to get them).
> >
> > Signed-off-by: Igor Mammedov <imammedo@xxxxxxxxxx>
[...]
> >
> > /**
> Unfortunately the patch above does not seem to prevent the kernel crash.
> Here comes the requested diagnostic info: dmesg's before and after,
> choice of lspci's and acpi tables. Hope that will help :-)

Looking at dmesg-6.5-debug_after.txt, there is no
"BUG: kernel NULL pointer dereference" line anymore.
The call traces you see are induced by WARN(), whose purpose is to show
the call path that leads to enable_slot().

Let me split the potential fix from the debug code and repost them as
separate patches for you to try.
I'd like to see the debug output without the 'fix' to track down which
root port/device causes the NULL pointer dereference.
And hopefully, in a few roundtrips, figure out why the old code doesn't
crash.

PS:
What happens is that on resume the firmware (likely the EC) issues an
ACPI bus check on the root ports, and that bus check is wired to the
acpiphp module (even though the pciehp module was initialized at boot to
manage the root ports); it's likely a firmware bug.
I'd guess the intent behind this was to check whether PCIe devices were
hotplugged while the laptop was asleep, and for some reason they didn't
use native PCIe hotplug to handle that.
However, looking at the laptop specs, you can't hotplug PCIe devices via
external ports. Given how old the laptop is, the firmware isn't going to
be fixed, so we would need a workaround or a DSDT fixup to skip the bus
check.

The options I see are to keep the old kernel for such cases, or to bail
out early from bus check/enable_slot() since the root port is managed by
the pciehp module (and let it handle hotplug).

> Thanks, Woody
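
To make the second option a bit more concrete, here is a minimal
user-space sketch of the early bail-out. It is not kernel code and not a
proposed patch: struct mock_pci_dev, struct mock_pci_bus,
port_handled_natively() and mock_bus_check() are hypothetical stand-ins,
and the two flags only model "this port is hotplug capable" and "pciehp
owns hotplug on it". How the real acpiphp code would find that out is
left open here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; NOT the kernel's struct pci_dev. */
struct mock_pci_dev {
	bool is_hotplug_bridge;   /* port is hotplug capable (assumed flag) */
	bool native_pcie_hotplug; /* pciehp owns hotplug here (assumed flag) */
};

/* Hypothetical stand-in for the bus below a port. */
struct mock_pci_bus {
	struct mock_pci_dev *self; /* bridge leading to this bus, may be NULL */
};

enum bus_check_result {
	BUS_CHECK_SKIPPED, /* left to native PCIe hotplug */
	BUS_CHECK_SCANNED, /* ACPI hotplug path would run */
};

/* True when the port above 'bus' is managed by native PCIe hotplug. */
static bool port_handled_natively(const struct mock_pci_bus *bus)
{
	const struct mock_pci_dev *bridge = bus ? bus->self : NULL;

	/* No bridge to ask: do not dereference, fall back to the ACPI path. */
	if (!bridge)
		return false;

	return bridge->is_hotplug_bridge && bridge->native_pcie_hotplug;
}

/* Model of a bus check handler that bails out before enabling the slot. */
static enum bus_check_result mock_bus_check(const struct mock_pci_bus *bus)
{
	if (port_handled_natively(bus))
		return BUS_CHECK_SKIPPED;

	/* The real handler would go on to its enable_slot() step here. */
	return BUS_CHECK_SCANNED;
}
```

With this shape, a firmware-issued bus check on a pciehp-managed root
port becomes a no-op, while ports without native hotplug still take the
ACPI path, and a NULL bridge pointer is never dereferenced.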