On 26/11/2024 21:04, Brian Norris wrote:
of_find_device_by_node() does not handle a NULL node pointer safely: it may
end up matching an arbitrary device, which we then start tearing down.
Check for NULL before calling it.

Resolves issues seen when doing `echo 1 > /sys/bus/pci/devices/.../remove`:
[ 222.952201] ------------[ cut here ]------------
[ 222.952218] WARNING: CPU: 0 PID: 5095 at drivers/regulator/core.c:5885 regulator_unregister+0x140/0x160
...
[ 222.953490] CPU: 0 UID: 0 PID: 5095 Comm: bash Tainted: G C 6.12.0-rc1 #3
...
[ 222.954134] Call trace:
[ 222.954150] regulator_unregister+0x140/0x160
[ 222.954186] devm_rdev_release+0x1c/0x30
[ 222.954215] release_nodes+0x68/0x100
[ 222.954249] devres_release_all+0x98/0xf8
[ 222.954282] device_unbind_cleanup+0x20/0x70
[ 222.954306] device_release_driver_internal+0x1f4/0x240
[ 222.954333] device_release_driver+0x20/0x40
[ 222.954358] bus_remove_device+0xd8/0x170
[ 222.954393] device_del+0x154/0x380
[ 222.954422] device_unregister+0x28/0x88
[ 222.954451] of_device_unregister+0x1c/0x30
[ 222.954488] pci_stop_bus_device+0x154/0x1b0
[ 222.954521] pci_stop_and_remove_bus_device_locked+0x28/0x48
[ 222.954553] remove_store+0xa0/0xb8
[ 222.954589] dev_attr_store+0x20/0x40
[ 222.954615] sysfs_kf_write+0x4c/0x68
[ 222.954644] kernfs_fop_write_iter+0x128/0x200
[ 222.954670] do_iter_readv_writev+0xdc/0x1e0
[ 222.954709] vfs_writev+0x100/0x2a0
[ 222.954742] do_writev+0x84/0x130
[ 222.954773] __arm64_sys_writev+0x28/0x40
[ 222.954808] invoke_syscall+0x50/0x120
[ 222.954845] el0_svc_common.constprop.0+0x48/0xf0
[ 222.954878] do_el0_svc+0x24/0x38
[ 222.954910] el0_svc+0x34/0xe0
[ 222.954945] el0t_64_sync_handler+0x120/0x138
[ 222.954978] el0t_64_sync+0x190/0x198
[ 222.955006] ---[ end trace 0000000000000000 ]---
[ 222.965216] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0
...
[ 223.107395] CPU: 3 UID: 0 PID: 5095 Comm: bash Tainted: G WC 6.12.0-rc1 #3
...
[ 223.227750] Call trace:
[ 223.230501] pci_stop_bus_device+0x190/0x1b0
[ 223.235314] pci_stop_and_remove_bus_device_locked+0x28/0x48
[ 223.241672] remove_store+0xa0/0xb8
[ 223.245616] dev_attr_store+0x20/0x40
[ 223.249737] sysfs_kf_write+0x4c/0x68
[ 223.253859] kernfs_fop_write_iter+0x128/0x200
[ 223.253887] do_iter_readv_writev+0xdc/0x1e0
[ 223.263631] vfs_writev+0x100/0x2a0
[ 223.267550] do_writev+0x84/0x130
[ 223.271273] __arm64_sys_writev+0x28/0x40
[ 223.275774] invoke_syscall+0x50/0x120
[ 223.279988] el0_svc_common.constprop.0+0x48/0xf0
[ 223.285270] do_el0_svc+0x24/0x38
[ 223.288993] el0_svc+0x34/0xe0
[ 223.292426] el0t_64_sync_handler+0x120/0x138
[ 223.297311] el0t_64_sync+0x190/0x198
[ 223.301423] Code: 17fffff8 91030000 d2800101 f9800011 (c85f7c02)
[ 223.308248] ---[ end trace 0000000000000000 ]---
Fixes: 681725afb6b9 ("PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent")
Signed-off-by: Brian Norris <briannorris@xxxxxxxxxxxx>
---
drivers/pci/remove.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 963b8d2855c1..efc37fcb73e2 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -19,14 +19,19 @@ static void pci_free_resources(struct pci_dev *dev)

 static void pci_pwrctrl_unregister(struct device *dev)
 {
+	struct device_node *np;
 	struct platform_device *pdev;

-	pdev = of_find_device_by_node(dev_of_node(dev));
+	np = dev_of_node(dev);
+	if (!np)
+		return;
+
+	pdev = of_find_device_by_node(np);
 	if (!pdev)
 		return;

 	of_device_unregister(pdev);
-	of_node_clear_flag(dev_of_node(dev), OF_POPULATED);
+	of_node_clear_flag(np, OF_POPULATED);
 }

 static void pci_stop_dev(struct pci_dev *dev)
This fixes a regression we have been seeing on Tegra devices. FWIW ...
Tested-by: Jon Hunter <jonathanh@xxxxxxxxxx>
Thanks!
Jon