On Tue, Oct 22, 2019 at 6:57 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> syzkaller managed to trigger the following crash:
>
>   [...]
>   BUG: unable to handle page fault for address: ffffc90001923030
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD aa551067 P4D aa551067 PUD aa552067 PMD a572b067 PTE 80000000a1173163
>   Oops: 0000 [#1] PREEMPT SMP KASAN
>   CPU: 0 PID: 7982 Comm: syz-executor912 Not tainted 5.4.0-rc3+ #0
>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>   RIP: 0010:bpf_jit_binary_hdr include/linux/filter.h:787 [inline]
>   RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:531 [inline]
>   RIP: 0010:bpf_tree_comp kernel/bpf/core.c:600 [inline]
>   RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
>   RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
>   RIP: 0010:bpf_prog_kallsyms_find kernel/bpf/core.c:674 [inline]
>   RIP: 0010:is_bpf_text_address+0x184/0x3b0 kernel/bpf/core.c:709
>
> After further debugging it turns out that we walk kallsyms while in parallel
> we tear down a BPF program which contains subprograms that have been JITed
> though the program itself has not been fully exposed and is eventually bailing
> out with error.
>
> The bpf_prog_kallsyms_del_subprogs() in bpf_prog_load()'s error path removes
> the symbols, however, bpf_prog_free() tears down the JIT memory too early via
> scheduled work. Instead, it needs to properly respect RCU grace period as the
> kallsyms walk for BPF is under RCU.
>
> Fix it by refactoring __bpf_prog_put()'s tear down and reuse it in our error
> path where we defer final destruction when we have subprogs in the program.
>
> Fixes: 7d1982b4e335 ("bpf: fix panic in prog load calls cleanup")
> Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
> Reported-and-tested-by: syzbot+710043c5d1d5b5013bc7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>

Applied. Thanks!
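
For readers following along, here is a rough userspace sketch of the
ordering the commit message describes: unpublish the symbol first, then
defer the actual free until a grace period has passed, instead of freeing
right away. This is not the patch itself. All toy_* names are hypothetical,
the model is single-threaded, and toy_grace_period_end() only stands in for
what a real RCU grace period would guarantee (in the kernel, a deferral of
this kind is typically queued with call_rcu()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a JITed (sub)program image; not a kernel type. */
struct toy_prog {
    unsigned long start;            /* first address of the image */
    unsigned long end;              /* one past the last address */
    struct toy_prog *next_deferred; /* link in the deferred-free list */
};

#define TOY_MAX_PROGS 8

static struct toy_prog *toy_symtab[TOY_MAX_PROGS]; /* models the symbol tree */
static struct toy_prog *toy_deferred;              /* awaiting a grace period */
static int toy_freed;                              /* images freed so far */

struct toy_prog *toy_prog_alloc(unsigned long start, unsigned long len)
{
    struct toy_prog *p = calloc(1, sizeof(*p));

    if (!p)
        return NULL;
    p->start = start;
    p->end = start + len;
    return p;
}

/* Make the image visible to lookups. Returns 0 on success, -1 if full. */
int toy_publish(struct toy_prog *p)
{
    for (size_t i = 0; i < TOY_MAX_PROGS; i++) {
        if (!toy_symtab[i]) {
            toy_symtab[i] = p;
            return 0;
        }
    }
    return -1;
}

/* Reader side: dereferences every published entry while searching. */
struct toy_prog *toy_find(unsigned long addr)
{
    for (size_t i = 0; i < TOY_MAX_PROGS; i++) {
        struct toy_prog *p = toy_symtab[i];

        if (p && addr >= p->start && addr < p->end)
            return p;
    }
    return NULL;
}

static void toy_unpublish(struct toy_prog *p)
{
    for (size_t i = 0; i < TOY_MAX_PROGS; i++) {
        if (toy_symtab[i] == p)
            toy_symtab[i] = NULL;
    }
}

/* Teardown: unpublish first, then queue the free instead of freeing now. */
void toy_prog_put(struct toy_prog *p)
{
    assert(p != NULL);
    toy_unpublish(p);
    p->next_deferred = toy_deferred;
    toy_deferred = p;
}

/* Models the end of a grace period: no reader can still hold a pointer. */
void toy_grace_period_end(void)
{
    while (toy_deferred) {
        struct toy_prog *p = toy_deferred;

        toy_deferred = p->next_deferred;
        free(p);
        toy_freed++;
    }
}

int toy_freed_count(void)
{
    return toy_freed;
}
```

After toy_prog_put() a lookup no longer finds the image, but its memory
stays valid until toy_grace_period_end() runs. A reader that fetched the
pointer just before the unpublish would therefore not touch freed memory,
which is the property the crash above shows was missing on the error path.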