>On Wed, Dec 02, 2020 at 10:33:42AM +0800, Jubin Zhong wrote: >> Once kobject_init_and_add() failed, pci_slot_release() is called to >> delete slot->list from parent->slots. But slot->list is intialized >> afterwards, so we ran into the following crash: >> >> Unable to handle kernel NULL pointer dereference at virtual address >> 00000000 >> ... >> CPU: 10 PID: 1 Comm: swapper/0 Not tainted 4.4.240 #197 >> task: ffffeb398a45ef10 task.stack: ffffeb398a470000 >> PC is at __list_del_entry_valid+0x5c/0xb0 >> LR is at pci_slot_release+0x84/0xe4 >> ... >> __list_del_entry_valid+0x5c/0xb0 >> pci_slot_release+0x84/0xe4 >> kobject_put+0x184/0x1c4 >> pci_create_slot+0x17c/0x1b4 >> __pci_hp_initialize+0x68/0xa4 >> pciehp_probe+0x1a4/0x2fc >> pcie_port_probe_service+0x58/0x84 >> driver_probe_device+0x320/0x470 >> __driver_attach+0x54/0xb8 >> bus_for_each_dev+0xc8/0xf0 >> driver_attach+0x30/0x3c >> bus_add_driver+0x1b0/0x24c >> driver_register+0x9c/0xe0 >> pcie_port_service_register+0x64/0x7c >> pcied_init+0x44/0xa4 >> do_one_initcall+0x1d0/0x1f0 >> kernel_init_freeable+0x24c/0x254 >> kernel_init+0x18/0xe8 >> ret_from_fork+0x10/0x20 >> >> Fixes: 8a94644b440e ("PCI: Fix pci_create_slot() reference count >> leak") >> Signed-off-by: Jubin Zhong <zhongjubin@xxxxxxxxxx> >> Cc: stable@xxxxxxxxxxxxxxx #v4.4.235 >I'm curious how you noticed this. Did kobject_init_and_add() fail for some reason? Is there a bug there that we need to fix also? Actually not. In the very beginning, Coverity reported a RESOURCE_LEAK warning of slot memory. It should be a false alarm because pci_slot_release() is already called to free slot memory as a function hook (which Coverity could not recognize). After code review I prepared to cancel the warning, that's when I noticed the unintialized slot list is also deleted by pci_slot_release(). >And why did you mention v4.4.235 in the stable tag? 8a94644b440e wasn't merged until v5.9. If 8a94644b440e was backported to older kernels, we can't know that in general, so I think whoever backported it is responsible for watching for Fixes tags that mention it. The Coverity warning was reported on v4.4.240. I also did a little test on v4.4.240 by forcing kobject_init_and_add() to fail, that's how I got the above call stack. Since 8a94644b440e was backported to v4.4.235, I thought it should be necessary to fix it on v4.4 too. >>I updated the stable tag to "v5.9+" and applied this to pci/hotplug for v5.11, thanks! >> ---- >> v2: >> Since both slot memory and slot->list would be handled by >> pci_slot_release(), we need to make sure slot->list is properly >> initialized beforehand. >> >> v1: https://lore.kernel.org/lkml/1606288971-47927-1-git-send-email-zhongjubin@xxxxxxxxxx/ >> Two things need to be cleaned up on pci_create_slot's error path: >> 1. free slot memory >> 2. remove slot->list from its parent->slots >> This patch mistakenly took slot memory as unfreed (which is not), >> and would introduce double free problem. >> --- >> drivers/pci/slot.c | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c index >> 3861505..ed2077e 100644 >> --- a/drivers/pci/slot.c >> +++ b/drivers/pci/slot.c >> @@ -272,6 +272,9 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr, >> goto err; >> } >> >> + INIT_LIST_HEAD(&slot->list); >> + list_add(&slot->list, &parent->slots); >> + >> err = kobject_init_and_add(&slot->kobj, &pci_slot_ktype, NULL, >> "%s", slot_name); >> if (err) { >> @@ -279,9 +282,6 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr, >> goto err; >> } >> >> - INIT_LIST_HEAD(&slot->list); >> - list_add(&slot->list, &parent->slots); >> - >> down_read(&pci_bus_sem); >> list_for_each_entry(dev, &parent->devices, bus_list) >> if (PCI_SLOT(dev->devfn) == slot_nr) >> -- >> 1.8.5.6 >>