Good afternoon,

After coming across a recent Syzkaller report [0], I took some time to
reproduce the issue and then to see whether there was a trivial way to
mitigate it.

The report suggests that a BUG() in prog_array_map_poke_run() [1] can be
trivially and reliably triggered from userspace using the PoC provided [2].

    ret = bpf_arch_text_poke(poke->tailcall_bypass,
                             BPF_MOD_JUMP,
                             old_bypass_addr,
                             poke->bypass_addr);
    BUG_ON(ret < 0 && ret != -EINVAL);

Indeed, the PoC consistently triggers the BUG(), not only on the reported
kernel (v6.1), but also on linux-next. I searched LORE, but failed to find
any patches attempting to fix this.

    kernel BUG at kernel/bpf/arraymap.c:1094!
    invalid opcode: 0000 [#1] PREEMPT SMP KASAN
    CPU: 5 PID: 45 Comm: kworker/5:0 Not tainted 6.6.0-rc3-next-20230929-dirty #74
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
    Workqueue: events prog_array_map_clear_deferred
    RIP: 0010:prog_array_map_poke_run+0x6b4/0x6d0
    Code: ff 0f 0b e8 1e 27 e1 ff 48 c7 c7 60 80 93 85 48 c7 c6 00 7f 93 85 48 c7 c2 bb c2 39 86 b9 45 04 00 00 45 89 f8 e8 9c 890
    RSP: 0018:ffffc9000036fb50 EFLAGS: 00010246
    RAX: 0000000000000044 RBX: ffff88811f337490 RCX: 63af48a1314f9900
    RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
    RBP: ffffc9000036fbe8 R08: ffffffff815c23c5 R09: 1ffff11084c14eba
    R10: dfffe91084c14ebc R11: ffffed1084c14ebb R12: ffff888116517800
    R13: dffffc0000000000 R14: ffff888125a1a400 R15: 00000000fffffff0
    FS: 0000000000000000(0000) GS:ffff888426080000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000004ab678 CR3: 0000000122ac4000 CR4: 0000000000350eb0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     ? __die_body+0x92/0xf0
     ? die+0xa2/0xe0
     ? do_trap+0x12f/0x370
     ? handle_invalid_op+0xa6/0x140
     ? handle_invalid_op+0xdf/0x140
     ? prog_array_map_poke_run+0x6b4/0x6d0
     ? prog_array_map_poke_run+0x6b4/0x6d0
     ? exc_invalid_op+0x32/0x50
     ? asm_exc_invalid_op+0x1b/0x20
     ? __wake_up_klogd+0xd5/0x110
     ? prog_array_map_poke_run+0x6b4/0x6d0
     ? bpf_prog_6781ebc2dae4bad9+0xb/0x53
     fd_array_map_delete_elem+0x152/0x250
     prog_array_map_clear_deferred+0xf6/0x210
     ? __bpf_array_map_seq_show+0xa40/0xa40
     ? kick_pool+0x164/0x350
     ? process_one_work+0x57a/0xd00
     process_one_work+0x5e4/0xd00
     worker_thread+0x9cf/0xea0
     kthread+0x2b4/0x350
     ? pr_cont_work+0x580/0x580
     ? kthread_blkcg+0xd0/0xd0
     ret_from_fork+0x4a/0x80
     ? kthread_blkcg+0xd0/0xd0
     ret_from_fork_asm+0x11/0x20
     </TASK>
    Modules linked in:
    ---[ end trace 0000000000000000 ]---

However, with my very limited knowledge of the BPF subsystem, I was unable
to trivially fix the issue. Hopefully some knowledgeable person would be
kind enough to provide me with some pointers.

bpf_arch_text_poke() seems to be returning -EBUSY due to a non-zero
memcmp() result (i.e. a mismatch) at [3].

    ret = -EBUSY;
    mutex_lock(&text_mutex);
    if (memcmp(ip, old_insn, X86_PATCH_SIZE)) {
            goto out;
    [...]

When dumping the memory at those locations, this is the result:

    ip:       e9 06 00 00 00
    old_insn: 0f 1f 44 00 00
    nop_insn: 0f 1f 44 00 00

As you can see, the bytes at 'ip' do not match those in 'old_insn', so
bpf_arch_text_poke() returns early with -EBUSY. This suggests that the
data pointed to by 'old_insn' (and by extension 'prog') should have been
changed to the value of 'ip' when emit_call()ing, but wasn't.

I can see what is happening, but I'm afraid tracking down the possible
causes of the corruption became too time-consuming on this occasion.

Would anyone be able to chime in with their take on possible causes,
please?

Any help would be gratefully received.

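For anyone wanting to look at the dumped bytes without booting a kernel,
the mismatch above can be replayed in userspace. What follows is only a
minimal sketch: the helper names (classify_insn, rel32_of,
check_expected_old_insn) are hypothetical and do not exist in the kernel,
and the check is a simplified stand-in for the memcmp() test quoted from
[3], not the real bpf_arch_text_poke(). The only facts it relies on are
the x86 encodings: e9 is "jmp rel32" and 0f 1f 44 00 00 is the 5-byte NOP.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define X86_PATCH_SIZE 5 /* same value as in arch/x86/net/bpf_jit_comp.c */

/* Bytes copied verbatim from the memory dump in this mail. */
static const uint8_t dump_ip[X86_PATCH_SIZE]       = { 0xe9, 0x06, 0x00, 0x00, 0x00 };
static const uint8_t dump_old_insn[X86_PATCH_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Hypothetical helper: name a 5-byte x86 instruction by its leading bytes. */
static const char *classify_insn(const uint8_t *insn)
{
	static const uint8_t nop5[X86_PATCH_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

	assert(insn != NULL);
	if (!memcmp(insn, nop5, X86_PATCH_SIZE))
		return "nop5";
	if (insn[0] == 0xe9)
		return "jmp rel32";
	if (insn[0] == 0xe8)
		return "call rel32";
	return "unknown";
}

/* Hypothetical helper: little-endian rel32 operand of an e8/e9 instruction. */
static int32_t rel32_of(const uint8_t *insn)
{
	uint32_t raw = (uint32_t)insn[1] |
		       ((uint32_t)insn[2] << 8) |
		       ((uint32_t)insn[3] << 16) |
		       ((uint32_t)insn[4] << 24);

	return (int32_t)raw;
}

/*
 * Hypothetical stand-in for the pre-patch check: the bytes currently at
 * 'ip' must equal the instruction the caller believes is there, otherwise
 * the patch is refused with -EBUSY.
 */
static int check_expected_old_insn(const uint8_t *ip, const uint8_t *old_insn)
{
	return memcmp(ip, old_insn, X86_PATCH_SIZE) ? -EBUSY : 0;
}
```

Calling check_expected_old_insn(dump_ip, dump_old_insn) returns -EBUSY,
while classify_insn() reports "jmp rel32" for 'ip' (with rel32_of() giving
a displacement of 6) and "nop5" for 'old_insn'. In other words, the site
holds a jump while the caller expected a NOP; the sketch says nothing
about why the two views diverged.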
[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
[1] https://elixir.bootlin.com/linux/latest/source/kernel/bpf/arraymap.c#L1092
[2] https://syzkaller.appspot.com/text?tag=ReproC&x=1397180f680000
[3] https://elixir.bootlin.com/linux/latest/source/arch/x86/net/bpf_jit_comp.c#L387

--
Lee Jones [李琼斯]