Hi Namhyung Kim and BPF experts,

Greetings!

There is a possible deadlock in __bpf_ringbuf_reserve in v6.10 (lockdep reports possible recursive locking).

Found the first bad commit:
ee042be16cb4 locking: Apply contention tracepoints in the slow path

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240717_170536___bpf_ringbuf_reserve
Syzkaller repro code: https://github.com/xupengfe/syzkaller_logs/blob/main/240717_170536___bpf_ringbuf_reserve/repro.c
Syzkaller repro syscall: https://github.com/xupengfe/syzkaller_logs/blob/main/240717_170536___bpf_ringbuf_reserve/repro.prog
Syzkaller report: https://github.com/xupengfe/syzkaller_logs/blob/main/240717_170536___bpf_ringbuf_reserve/repro.report
Kconfig(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240717_170536___bpf_ringbuf_reserve/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240717_170536___bpf_ringbuf_reserve/bisect_info.log
v6.10 bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240717_170536___bpf_ringbuf_reserve/bzImage_0c3836482481200ead7b416ca80c68a29cfdaabd.tar.gz
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240717_170536___bpf_ringbuf_reserve/0c3836482481200ead7b416ca80c68a29cfdaabd_dmesg.log

"
[ 25.063013]
[ 25.063211] ============================================
[ 25.063694] WARNING: possible recursive locking detected
[ 25.064165] 6.10.0-0c3836482481 #1 Tainted: G W
[ 25.064787] --------------------------------------------
[ 25.065264] repro/745 is trying to acquire lock:
[ 25.065693] ffffc90004f1a0d8 (&rb->spinlock){-.-.}-{2:2}, at: __bpf_ringbuf_reserve+0x386/0x460
[ 25.066517]
[ 25.066517] but task is already holding lock:
[ 25.067054] ffffc900018360d8 (&rb->spinlock){-.-.}-{2:2}, at: __bpf_ringbuf_reserve+0x386/0x460
[ 25.067878]
[ 25.067878] other info that might help us debug this:
[ 25.068504] Possible unsafe locking scenario:
[ 25.068504]
[ 25.069061] CPU0
[ 25.069301] ----
[ 25.069540] lock(&rb->spinlock);
[ 25.069879] lock(&rb->spinlock);
[ 25.070208]
[ 25.070208] *** DEADLOCK ***
[ 25.070208]
[ 25.070741] May be due to missing lock nesting notation
[ 25.070741]
[ 25.071362] 4 locks held by repro/745:
[ 25.071731] #0: ffffffff86fff388 (pcpu_alloc_mutex){+.+.}-{3:3}, at: pcpu_alloc_noprof+0xa07/0x1120
[ 25.072674] #1: ffffffff86e58de0 (rcu_read_lock){....}-{1:2}, at: bpf_trace_run2+0x1b7/0x5a0
[ 25.073493] #2: ffffc900018360d8 (&rb->spinlock){-.-.}-{2:2}, at: __bpf_ringbuf_reserve+0x386/0x460
[ 25.074359] #3: ffffffff86e58de0 (rcu_read_lock){....}-{1:2}, at: bpf_trace_run2+0x1b7/0x5a0
[ 25.075180]
[ 25.075180] stack backtrace:
[ 25.075587] CPU: 0 PID: 745 Comm: repro Tainted: G W 6.10.0-0c3836482481 #1
[ 25.076373] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 25.077661] Call Trace:
[ 25.078033] <TASK>
[ 25.078253] dump_stack_lvl+0xea/0x150
[ 25.078650] dump_stack+0x19/0x20
[ 25.079003] print_deadlock_bug+0x3c0/0x680
[ 25.079417] __lock_acquire+0x2b2a/0x5ca0
[ 25.079829] ? __pfx___lock_acquire+0x10/0x10
[ 25.080270] ? __kasan_check_read+0x15/0x20
[ 25.080693] ? __lock_acquire+0xccf/0x5ca0
[ 25.081101] lock_acquire+0x1ce/0x580
[ 25.081472] ? __bpf_ringbuf_reserve+0x386/0x460
[ 25.081926] ? __pfx_lock_acquire+0x10/0x10
[ 25.082343] ? __kasan_check_read+0x15/0x20
[ 25.082770] _raw_spin_lock_irqsave+0x52/0x80
[ 25.083202] ? __bpf_ringbuf_reserve+0x386/0x460
[ 25.083920] __bpf_ringbuf_reserve+0x386/0x460
[ 25.084487] bpf_ringbuf_reserve+0x63/0xa0
[ 25.084904] bpf_prog_9efe54833449f08e+0x2d/0x47
[ 25.085383] bpf_trace_run2+0x238/0x5a0
[ 25.085784] ? __pfx_bpf_trace_run2+0x10/0x10
[ 25.086237] ? __pfx___bpf_trace_contention_end+0x10/0x10
[ 25.086779] __bpf_trace_contention_end+0xf/0x20
[ 25.087230] __traceiter_contention_end+0x66/0xb0
[ 25.087697] trace_contention_end.constprop.0+0xdc/0x140
[ 25.088207] __pv_queued_spin_lock_slowpath+0x2a1/0xc80
[ 25.088751] ? __pfx___pv_queued_spin_lock_slowpath+0x10/0x10
[ 25.089369] ? __this_cpu_preempt_check+0x21/0x30
[ 25.089833] ? lock_acquire+0x1de/0x580
[ 25.090222] do_raw_spin_lock+0x1fb/0x280
[ 25.090622] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 25.091056] ? debug_smp_processor_id+0x20/0x30
[ 25.091506] ? rcu_is_watching+0x19/0xc0
[ 25.091900] _raw_spin_lock_irqsave+0x5a/0x80
[ 25.092337] ? __bpf_ringbuf_reserve+0x386/0x460
[ 25.092791] __bpf_ringbuf_reserve+0x386/0x460
[ 25.093269] bpf_ringbuf_reserve+0x63/0xa0
[ 25.093694] bpf_prog_9efe54833449f08e+0x2d/0x47
[ 25.094138] bpf_trace_run2+0x238/0x5a0
[ 25.094525] ? __pfx_bpf_trace_run2+0x10/0x10
[ 25.094963] ? lock_acquire+0x1de/0x580
[ 25.095344] ? __pfx_lock_acquire+0x10/0x10
[ 25.095766] ? __pfx___bpf_trace_contention_end+0x10/0x10
[ 25.096296] __bpf_trace_contention_end+0xf/0x20
[ 25.096755] __traceiter_contention_end+0x66/0xb0
[ 25.097245] trace_contention_end+0xc5/0x120
[ 25.097699] __mutex_lock+0x257/0x1660
[ 25.098077] ? pcpu_alloc_noprof+0xa07/0x1120
[ 25.098518] ? __pfx___lock_acquire+0x10/0x10
[ 25.098951] ? _find_first_bit+0x95/0xc0
[ 25.099340] ? __pfx___mutex_lock+0x10/0x10
[ 25.099760] ? __this_cpu_preempt_check+0x21/0x30
[ 25.100223] ? lock_release+0x418/0x840
[ 25.100638] mutex_lock_killable_nested+0x1f/0x30
[ 25.101109] ? mutex_lock_killable_nested+0x1f/0x30
[ 25.101611] pcpu_alloc_noprof+0xa07/0x1120
[ 25.102034] ? lockdep_init_map_type+0x2df/0x810
[ 25.102488] ? __raw_spin_lock_init+0x44/0x120
[ 25.102931] ? __kasan_check_write+0x18/0x20
[ 25.103352] mm_init+0x8da/0xec0
[ 25.103692] copy_mm+0x3cf/0x2550
[ 25.104040] ? __pfx_copy_mm+0x10/0x10
[ 25.104431] ? lockdep_init_map_type+0x2df/0x810
[ 25.104901] ? __raw_spin_lock_init+0x44/0x120
[ 25.105371] copy_process+0x361c/0x6a60
[ 25.105776] ? __pfx_copy_process+0x10/0x10
[ 25.106194] ? __kasan_check_read+0x15/0x20
[ 25.106607] ? __lock_acquire+0x1a02/0x5ca0
[ 25.107033] kernel_clone+0xfd/0x8d0
[ 25.107396] ? __pfx_kernel_wait4+0x10/0x10
[ 25.107811] ? __pfx_kernel_clone+0x10/0x10
[ 25.108214] ? __this_cpu_preempt_check+0x21/0x30
[ 25.108736] ? lock_release+0x418/0x840
[ 25.109144] __do_sys_clone+0xe1/0x120
[ 25.109529] ? __pfx___do_sys_clone+0x10/0x10
[ 25.109999] __x64_sys_clone+0xc7/0x150
[ 25.110375] ? syscall_trace_enter+0x14a/0x230
[ 25.110815] x64_sys_call+0x1e76/0x20d0
[ 25.111188] do_syscall_64+0x6d/0x140
[ 25.111559] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 25.112045] RIP: 0033:0x7f6219f189d7
[ 25.112415] Code: 00 00 00 f3 0f 1e fa 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 39 41 89 c0 85 c0 75 2a 64 48 8b 04 25 10 00
[ 25.114082] RSP: 002b:00007fff149665d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 25.115078] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f6219f189d7
[ 25.115792] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 25.116487] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000007194fa985
[ 25.117136] R10: 00007f621a028a10 R11: 0000000000000246 R12: 0000000000000000
[ 25.117793] R13: 0000000000401e31 R14: 0000000000403e08 R15: 00007f621a073000
[ 25.118449] </TASK>
"

Thank you!

---

If you don't need the following environment to reproduce the problem or if you already have one reproduced environment, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
  // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
  // You could change the bzImage_xxx as you want
  // Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
You could use below command to log in, there is no password for root.
ssh -p 10023 root@localhost

After logging in to the VM (virtual machine) successfully, you can transfer the reproducer binary to the VM as shown below and reproduce the problem in the VM:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for the target kernel:
Please use the target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           // x should be equal to or less than the number of CPUs your PC has

Fill the bzImage file into the above start3.sh to load the target kernel in the VM.

Tips:
If you already have qemu-system-x86_64, please ignore the info below.
If you want to install qemu v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install

Best Regards,
Thanks!