> On Mon, Jul 22, 2024 at 11:38:31AM +0800, Z qiang wrote:
> > >
> > > On Sun, Jul 21, 2024 at 05:53:21AM -0700, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    51835949dda3 Merge tag 'net-next-6.11' of git://git.kernel..
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=150e825e980000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=28bac69fa31fbb3a
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=784d0a1246a539975f05
> > > > compiler:       aarch64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > > userspace arch: arm64
> > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15d4bf4e980000
> > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17a3c349980000
> > > >
> > > > Downloadable assets:
> > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-51835949.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/7a3a01db5542/vmlinux-51835949.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/14d329019155/Image-51835949.gz.xz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+784d0a1246a539975f05@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > >
> > > > ------------[ cut here ]------------
> > > > Voluntary context switch within RCU read-side critical section!
> > > > WARNING: CPU: 0 PID: 3460 at kernel/rcu/tree_plugin.h:330 rcu_note_context_switch+0x354/0x49c kernel/rcu/tree_plugin.h:330
> > >
> > > Taking a voluntary context switch in an RCU read-side critical section
> > > voids your RCU warranty.
> > >
> > > > Modules linked in:
> > > > CPU: 0 PID: 3460 Comm: syz-executor248 Not tainted 6.10.0-syzkaller-04472-g51835949dda3 #0
> > > > Hardware name: linux,dummy-virt (DT)
> > > > pstate: 614000c9 (nZCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > > > pc : rcu_note_context_switch+0x354/0x49c kernel/rcu/tree_plugin.h:330
> > > > lr : rcu_note_context_switch+0x354/0x49c kernel/rcu/tree_plugin.h:330
> > > > sp : ffff800089523d30
> > > > x29: ffff800089523d30 x28: f6f0000005d4a480 x27: 0000000000000000
> > > > x26: 0000000000000000 x25: f6f0000005d4a480 x24: ffff800082643318
> > > > x23: 0000000000000000 x22: f6f0000005d4a480 x21: fff000007f8d6240
> > > > x20: ffff80008261e040 x19: fff000007f8d7040 x18: fffffffffffcb658
> > > > x17: fff07ffffd2b9000 x16: ffff800080000000 x15: 0000000000000048
> > > > x14: fffffffffffcb6a0 x13: ffff80008266b0a8 x12: 000000000000088b
> > > > x11: 00000000000002d9 x10: ffff80008271f500 x9 : ffff80008266b0a8
> > > > x8 : 00000000ffffdfff x7 : ffff80008271b0a8 x6 : 00000000000002d9
> > > > x5 : fff000007f8cbf48 x4 : 40000000ffffe2d9 x3 : fff07ffffd2b9000
> > > > x2 : 0000000000000000 x1 : 0000000000000000 x0 : f6f0000005d4a480
> > > > Call trace:
> > > >  rcu_note_context_switch+0x354/0x49c kernel/rcu/tree_plugin.h:330
> > > >  __schedule+0xb0/0x850 kernel/sched/core.c:6417
> > > >  __schedule_loop kernel/sched/core.c:6606 [inline]
> > > >  schedule+0x34/0x104 kernel/sched/core.c:6621
> > > >  do_notify_resume+0xe4/0x164 arch/arm64/kernel/entry-common.c:136
> > > >  exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline]
> > > >  exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline]
> > > >  el0_interrupt+0xc4/0xc8 arch/arm64/kernel/entry-common.c:797
> > > >  __el0_irq_handler_common+0x18/0x24 arch/arm64/kernel/entry-common.c:802
> > > >  el0t_64_irq_handler+0x10/0x1c arch/arm64/kernel/entry-common.c:807
> > > >  el0t_64_irq+0x19c/0x1a0 arch/arm64/kernel/entry.S:599
> > > > ---[ end trace 0000000000000000 ]---
> > >
> > > If we are exiting to user mode, my first guess would be that someone
> > > did rcu_read_lock() but forgot the matching rcu_read_unlock().
> > >
> > > Can this be reproduced using lockdep? That would pinpoint the unmatched
> > > rcu_read_lock().
> >
> > This should be caused by this change (commit id:
> > ca567df74a28a9fb368c6b2d93e864113f73f5c2):
> > when tsk is NULL, the rcu_read_unlock() call is missed for NS_GET_TGID_IN_PIDNS.
>
> Very good, and it looks like that to me as well. Would you like to
> submit a fix patch and see if syzbot agrees?

I see there is a C reproducer
(https://syzkaller.appspot.com/x/repro.c?x=17a3c349980000). I will run it
on my local machine and then make a fix.

Thanks
Zqiang

>
> Thanx, Paul
>
> > Thanks
> > Zqiang
> >
> > >
> > > Thanx, Paul
> > >
> > > > ---
> > > > This report is generated by a bot. It may contain errors.
> > > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > > syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.
> > > >
> > > > syzbot will keep track of this issue. See:
> > > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > > >
> > > > If the report is already addressed, let syzbot know by replying with:
> > > > #syz fix: exact-commit-title
> > > >
> > > > If you want syzbot to run the reproducer, reply with:
> > > > #syz test: git://repo/address.git branch-or-commit-hash
> > > > If you attach or paste a git patch, syzbot will apply it before testing.
> > > >
> > > > If you want to overwrite report's subsystems, reply with:
> > > > #syz set subsystems: new-subsystem
> > > > (See the list of subsystem names on the web dashboard)
> > > >
> > > > If the report is a duplicate of another one, reply with:
> > > > #syz dup: exact-subject-of-another-report
> > > >
> > > > If you want to undo deduplication, reply with:
> > > > #syz undup
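
For illustration only, the pattern discussed in this thread (an early return on a NULL task that skips rcu_read_unlock()) can be modeled in userspace roughly as below. This is a sketch under assumptions, not the actual kernel code or the eventual fix: every model_* name, the task table, and the MODEL_ESRCH value are hypothetical stand-ins, and the nesting counter only imitates the idea of RCU read-side nesting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ESRCH 3	/* hypothetical "no such task" error value */

/* Userspace stand-in for a per-task RCU read-side nesting depth. */
static int model_rcu_nesting;

static void model_rcu_read_lock(void)
{
	model_rcu_nesting++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_nesting > 0);	/* unlock without lock is a bug */
	model_rcu_nesting--;
}

struct model_task {
	int pid;
	int tgid;
};

static struct model_task model_tasks[] = { { 1, 1 }, { 42, 40 } };

/* Hypothetical lookup: returns NULL when no task has this pid. */
static struct model_task *model_find_task(int pid)
{
	size_t n = sizeof(model_tasks) / sizeof(model_tasks[0]);

	for (size_t i = 0; i < n; i++)
		if (model_tasks[i].pid == pid)
			return &model_tasks[i];
	return NULL;
}

/* Buggy shape: the !tsk path returns with the read-side section open. */
static int model_get_tgid_buggy(int pid)
{
	struct model_task *tsk;
	int ret;

	model_rcu_read_lock();
	tsk = model_find_task(pid);
	if (!tsk)
		return -MODEL_ESRCH;	/* model_rcu_read_unlock() is skipped */
	ret = tsk->tgid;
	model_rcu_read_unlock();
	return ret;
}

/* One possible fixed shape: every path reaches the unlock. */
static int model_get_tgid_fixed(int pid)
{
	struct model_task *tsk;
	int ret = -MODEL_ESRCH;

	model_rcu_read_lock();
	tsk = model_find_task(pid);
	if (tsk)
		ret = tsk->tgid;
	model_rcu_read_unlock();
	return ret;
}

/* Models the condition checked at a voluntary context switch. */
static bool model_context_switch_ok(void)
{
	return model_rcu_nesting == 0;
}
```

Calling model_get_tgid_buggy() with a pid that is not in the table leaves model_rcu_nesting nonzero, so a later model_context_switch_ok() returns false. That loosely mirrors the "Voluntary context switch within RCU read-side critical section!" warning seen on the exit-to-user path. The fixed variant leaves the counter at zero on both paths.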