On Fri, Feb 7, 2025 at 3:45 PM Johan Hovold <johan@xxxxxxxxxx> wrote:
>
> On Fri, Feb 07, 2025 at 02:50:29PM +0100, Johan Hovold wrote:
>
> > Yeah, I hit something like this yesterday as well and did confirm that
> > reverting this commit makes the problem go away. Haven't had time to dig
> > much further.
> >
> > [ 110.522368] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>
> > [ 110.855238] Call trace:
> > [ 110.857861] simple_pm_bus_runtime_suspend+0x14/0x48 (P)
> > [ 110.863425] pm_generic_runtime_suspend+0x2c/0x44
> > [ 110.868362] pm_runtime_force_suspend+0x54/0x100
> > [ 110.873217] dpm_run_callback+0xb4/0x228
> > [ 110.877347] device_suspend_noirq+0x70/0x2a8
> > [ 110.881844] dpm_noirq_suspend_devices+0xe0/0x230
> > [ 110.886778] dpm_suspend_noirq+0x24/0x98
> > [ 110.890904] suspend_devices_and_enter+0x368/0x678
> > [ 110.895941] pm_suspend+0x1b4/0x348
> > [ 110.899627] state_store+0x8c/0xfc
> > [ 110.903228] kobj_attr_store+0x18/0x2c
> > [ 110.907195] sysfs_kf_write+0x4c/0x78
> > [ 110.911074] kernfs_fop_write_iter+0x120/0x1b4
> > [ 110.915735] vfs_write+0x2ac/0x358
> > [ 110.919352] ksys_write+0x68/0xfc
> > [ 110.922873] __arm64_sys_write+0x1c/0x28
> > [ 110.927002] invoke_syscall+0x48/0x110
> > [ 110.930969] el0_svc_common.constprop.0+0x40/0xe0
> > [ 110.935907] do_el0_svc+0x1c/0x28
> > [ 110.939427] el0_svc+0x48/0x114
> > [ 110.942769] el0t_64_sync_handler+0xc8/0xcc
> > [ 110.947180] el0t_64_sync+0x198/0x19c
> > [ 110.951059] Code: a9be7bfd 910003fd a90153f3 f9403c00 (f9400014)
> > [ 110.957428] ---[ end trace 0000000000000000 ]---
>
> Ok, so the driver data is never set and runtime PM is never enabled for
> this simple bus device, which uses pm_runtime_force_suspend() for system
> sleep.

This is kind of confusing. Why use pm_runtime_force_suspend() if
runtime PM is never enabled and cannot really work?

> This used to work as the runtime PM state was left at 'suspended', which
> makes pm_runtime_force_suspend() return early, but now we can end up
> with a call to the driver runtime PM ops that dereference the NULL
> driver data.

Thanks for the info!

pm_runtime_force_suspend() is a known weak point, but I had assumed
that it wouldn't be involved in dependency chains starting at devices
with DPM_FLAG_SMART_SUSPEND set.

Well, more work ...