On 2/9/25 11:32 PM, Harshit Mogalapalli wrote:
> Hello,
>
> On 09/02/25 21:27, Chuck Lever wrote:
>> On 2/7/25 10:10 AM, Greg KH wrote:
>>> On Thu, Feb 06, 2025 at 01:31:42PM -0500, Chuck Lever wrote:
>>>> Hi -
>>>>
>>>> For the past 3-4 days, NFSD CI runs on queue-5.10.y have been failing. I
>>>> looked into it today, and the test guest fails to reboot because it
>>>> panics during a reboot shutdown:
>>>>
>>>> [ 146.793087] BUG: unable to handle page fault for address: ffffffffffffffe8
>>>> [ 146.793918] #PF: supervisor read access in kernel mode
>>>> [ 146.794544] #PF: error_code(0x0000) - not-present page
>>>> [ 146.795172] PGD 3d5c14067 P4D 3d5c15067 PUD 3d5c17067 PMD 0
>>>> [ 146.795865] Oops: 0000 [#1] SMP NOPTI
>>>> [ 146.796326] CPU: 3 PID: 1 Comm: systemd-shutdow Not tainted 5.10.234-g99349f441fe1 #1
>>>> [ 146.797256] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
>>>> [ 146.798267] RIP: 0010:platform_shutdown+0x9/0x20
>>>> [ 146.798838] Code: b7 46 08 c3 cc cc cc cc 31 c0 83 bf a8 02 00 00 ff 75 ec c3 cc cc cc cc 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 68 <48> 8b 40 e8 48 85 c0 74 09 48 83 ef 10 ff e0 0f 1f 00 c3 cc cc cc
>>>> [ 146.801012] RSP: 0018:ff7f86f440013de0 EFLAGS: 00010246
>>>> [ 146.801651] RAX: 0000000000000000 RBX: ff4f0637469df418 RCX: 0000000000000000
>>>> [ 146.802500] RDX: 0000000000000001 RSI: ff4f0637469df418 RDI: ff4f0637469df410
>>>> [ 146.803350] RBP: ffffffffb2e79220 R08: ff4f0637469dd808 R09: ffffffffb2c5c698
>>>> [ 146.804203] R10: 0000000000000000 R11: 0000000000000000 R12: ff4f0637469df410
>>>> [ 146.805059] R13: ff4f0637469df490 R14: 00000000fee1dead R15: 0000000000000000
>>>> [ 146.805909] FS: 00007f4e7ecc6b80(0000) GS:ff4f063aafd80000(0000) knlGS:0000000000000000
>>>> [ 146.806866] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 146.807558] CR2: ffffffffffffffe8 CR3: 000000010ecb2001 CR4: 0000000000771ee0
>>>> [ 146.808412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> [ 146.809262] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>> [ 146.810109] PKRU: 55555554
>>>> [ 146.810460] Call Trace:
>>>> [ 146.810791] ? __die_body.cold+0x1a/0x1f
>>>> [ 146.811282] ? no_context.constprop.0+0xf8/0x2f0
>>>> [ 146.811854] ? exc_page_fault+0xc5/0x150
>>>> [ 146.812342] ? asm_exc_page_fault+0x1e/0x30
>>>> [ 146.812862] ? platform_shutdown+0x9/0x20
>>>> [ 146.813362] device_shutdown+0x158/0x1c0
>>>> [ 146.813853] __do_sys_reboot.cold+0x2f/0x5b
>>>> [ 146.814370] ? vfs_writev+0x9b/0x110
>>>> [ 146.814824] ? do_writev+0x57/0xf0
>>>> [ 146.815254] do_syscall_64+0x30/0x40
>>>> [ 146.815708] entry_SYSCALL_64_after_hwframe+0x67/0xd1
>>>>
>>>> Let me know how to further assist.
>>>
>>> Bisect?
>>
>> First bad commit:
>>
>> commit a06b4817f3d20721ae729d8b353457ff9fe6ff9c
>> Author: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
>> AuthorDate: Thu Nov 19 13:46:11 2020 +0100
>> Commit: Sasha Levin <sashal@xxxxxxxxxx>
>> CommitDate: Tue Feb 4 13:04:31 2025 -0500
>>
>>     driver core: platform: use bus_type functions
>>
>>     [ Upstream commit 9c30921fe7994907e0b3e0637b2c8c0fc4b5171f ]
>>
>>     This works towards the goal mentioned in 2006 in commit 594c8281f905
>>     ("[PATCH] Add bus_type probe, remove, shutdown methods.").
>>
>>     The functions are moved to where the other bus_type functions are
>>     defined and renamed to match the already established naming scheme.
>>
>>     Signed-off-by: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
>>     Link: https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@xxxxxxxxxxxxxx
>>     Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>>     Stable-dep-of: bf5821909eb9 ("mtd: hyperbus: hbmc-am654: fix an OF node reference leak")
>>     Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
>>
>
> While one option is to drop this, maybe we apply this below fix as well
> instead of dropping the above as it is pulled in as stable-dep-of for
> some other commit?
>
> commit 46e85af0cc53f35584e00bb5db7db6893d0e16e5
> Author: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx>
> Date: Sun Dec 13 02:55:33 2020 +0300
>
>     driver core: platform: don't oops in platform_shutdown() on unbound devices
>
>     On shutdown the driver core calls the bus' shutdown callback also for
>     unbound devices. A driver's shutdown callback however is only called for
>     devices bound to this driver. Commit 9c30921fe799 ("driver core:
>     platform: use bus_type functions") changed the platform bus from driver
>     callbacks to bus callbacks, so the shutdown function must be prepared to
>     be called without a driver. Add the corresponding check in the shutdown
>     function.
>
>     Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
>     Tested-by: Guenter Roeck <linux@xxxxxxxxxxxx>
>     Reviewed-by: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
>     Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx>
>     Link: https://lore.kernel.org/r/20201212235533.247537-1-dmitry.baryshkov@xxxxxxxxxx
>     Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>
> This commit talks about fixing an oops in platform_shutdown()
>
> Thanks,
> Harshit
>

I was about to test this idea, but 46e85af0cc53 does not apply cleanly
to origin/linux-5.10.y. Someone with more local expertise will need to
have a look.

--
Chuck Lever
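
For anyone picking this up, a minimal userspace sketch of the failure mode described in the 46e85af0cc53 commit message may help. It is not the kernel patch itself: the struct layouts are simplified stand-ins, and the names platform_shutdown_checked(), unchecked_load_address(), fake_shutdown() and run_sketch() are invented for illustration. The oops above shows RAX at zero and a fault address of ffffffffffffffe8, which looks consistent with a container_of() conversion applied to a NULL driver pointer followed by a load at a small negative offset; the sketch models that arithmetic and the kind of NULL check the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; field layout is illustrative, not the kernel's. */
struct device_driver { const char *name; };
struct device { struct device_driver *driver; /* NULL while unbound */ };
struct platform_device { int id; struct device dev; };

struct platform_driver {
	int (*probe)(struct platform_device *);
	int (*remove)(struct platform_device *);
	void (*shutdown)(struct platform_device *);
	struct device_driver driver;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Address an unchecked "drv->shutdown" load would touch when dev->driver
 * is NULL: container_of() subtracts the offset of .driver and the load
 * adds the offset of .shutdown. Done in integers to avoid real UB.
 */
static uintptr_t unchecked_load_address(void)
{
	return (uintptr_t)0 - offsetof(struct platform_driver, driver)
			    + offsetof(struct platform_driver, shutdown);
}

static int shutdown_calls;

static void fake_shutdown(struct platform_device *pdev)
{
	(void)pdev;
	shutdown_calls++;
}

/* Bus-level shutdown with the NULL check in place (hypothetical name). */
static void platform_shutdown_checked(struct device *dev)
{
	struct platform_driver *drv;

	if (!dev->driver)
		return; /* unbound device: no driver callback to run */

	drv = container_of(dev->driver, struct platform_driver, driver);
	if (drv->shutdown)
		drv->shutdown(container_of(dev, struct platform_device, dev));
}

/* Exercises one unbound and one bound device. */
static void run_sketch(void)
{
	struct platform_driver drv = { .shutdown = fake_shutdown };
	struct platform_device bound = { .dev = { .driver = &drv.driver } };
	struct platform_device unbound = { .dev = { .driver = NULL } };

	platform_shutdown_checked(&unbound.dev); /* returns early */
	assert(shutdown_calls == 0);
	platform_shutdown_checked(&bound.dev);   /* reaches fake_shutdown */
	assert(shutdown_calls == 1);
	/* Without the check, the load lands just below address zero. */
	assert((intptr_t)unchecked_load_address() < 0);
}
```

Calling run_sketch() once from a main() exercises both paths. The exact negative offset depends on the struct layout, so the value here will differ from the one in the oops; only the shape of the bug is modeled.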