Jeff Layton <jlayton@xxxxxxxxxx> writes: > Thanks for the confirmation. There were some oopses on some RHEL8 (5.14 > based kernels). The stack looked something like this: > > PID: 2415716 TASK: ffff937139090000 CPU: 3 COMMAND: "ls" > #0 [ffff9ef946b23728] machine_kexec at ffffffffac867cfe > #1 [ffff9ef946b23780] __crash_kexec at ffffffffac9ad94d > #2 [ffff9ef946b23848] crash_kexec at ffffffffac9ae881 > #3 [ffff9ef946b23860] oops_end at ffffffffac8274f1 > #4 [ffff9ef946b23880] no_context at ffffffffac879a03 > #5 [ffff9ef946b238d8] __bad_area_nosemaphore at ffffffffac879d64 > #6 [ffff9ef946b23920] do_page_fault at ffffffffac87a617 > #7 [ffff9ef946b23950] page_fault at ffffffffad20111e > [exception RIP: cifs_mount+1126] > RIP: ffffffffc08e8826 RSP: ffff9ef946b23a00 RFLAGS: 00010246 > RAX: 0000000000000000 RBX: ffff936c8221ea00 RCX: ffff936f8018b320 > RDX: 0000000000000001 RSI: ffff936f8018b420 RDI: ffff936c8221ea00 > RBP: ffff9ef946b23a90 R8: 5346445756535c5c R9: 6765642e50313031 > R10: 65622e666f6f7267 R11: 0063696c6275505c R12: ffff9370b192fc00 > R13: ffff936f8018b420 R14: 00000000003097ad R15: 0000000000000000 > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > #8 [ffff9ef946b23a98] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] > #9 [ffff9ef946b23aa0] cifs_smb3_do_mount at ffffffffc08d11f2 [cifs] > #10 [ffff9ef946b23b08] smb3_get_tree at ffffffffc0930ae0 [cifs] > #11 [ffff9ef946b23b30] vfs_get_tree at ffffffffacb52365 > #12 [ffff9ef946b23b50] fc_mount at ffffffffacb7485e > #13 [ffff9ef946b23b60] vfs_kern_mount at ffffffffacb748ec > #14 [ffff9ef946b23b80] cifs_dfs_do_automount at ffffffffc093552e [cifs] > #15 [ffff9ef946b23bc0] cifs_dfs_d_automount at ffffffffc0935880 [cifs] > #16 [ffff9ef946b23bd0] follow_managed at ffffffffacb5bdaf > #17 [ffff9ef946b23c10] lookup_fast at ffffffffacb5c7e5 > #18 [ffff9ef946b23c68] walk_component at ffffffffacb5d258 > #19 [ffff9ef946b23cc8] path_lookupat at ffffffffacb5e215 > #20 [ffff9ef946b23d28] filename_lookup at ffffffffacb62710 > #21 [ffff9ef946b23e40] vfs_statx at ffffffffacb55874 > #22 [ffff9ef946b23e98] __do_sys_statx at ffffffffacb5692b > #23 [ffff9ef946b23f38] do_syscall_64 at ffffffffac8043ab > #24 [ffff9ef946b23f50] entry_SYSCALL_64_after_hwframe at > ffffffffad2000a9 > RIP: 00007ff6a2637edf RSP: 00007ffe040017d0 RFLAGS: 00000246 > RAX: ffffffffffffffda RBX: 00007ffe04001910 RCX: 00007ff6a2637edf > RDX: 0000000000000100 RSI: 00007ffe04001910 RDI: 00000000ffffff9c > RBP: 0000000000000100 R8: 00007ffe040017f0 R9: 00000000ffffff9c > R10: 0000000000000002 R11: 0000000000000246 R12: 00007ffe040017f0 > R13: 0000000000000000 R14: 0000000000000003 R15: 000055ce1a3ae1b8 > ORIG_RAX: 000000000000014c CS: 0033 SS: 002b > > Analysis of the vmcore by Roberto showed that we had ended up past that > point with tcon==NULL and rc==0. Interesting. Thanks for sharing the backtrace! So it actually ended up with NULL @tcon and rc == 0 while mounting a DFS link. Nice catch! > Steve's patch would have fixed the panic there, but I think the host > would have ended up with a successful mount, but with a broken > superblock. The current code seems a bit less fragile, and I didn't see > any similar brokenness there, but I didn't look too hard either. Yeah, makes sense. > In any case, we'll plan to fix this up with a one-off in RHEL/Centos. > Thanks again for the sanity check! Would you mind to propose a patch that fixes the above and mark it for v5.14..v5.15?