On 2024/7/11 22:58, Tengda Wu wrote: > When loading a EXT program without specifying `attr->attach_prog_fd`, > the `prog->aux->dst_prog` will be null. At this time, calling > resolve_prog_type() anywhere will result in a null pointer dereference. > > Example stack trace: > > [ 8.107863] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004 > [ 8.108262] Mem abort info: > [ 8.108384] ESR = 0x0000000096000004 > [ 8.108547] EC = 0x25: DABT (current EL), IL = 32 bits > [ 8.108722] SET = 0, FnV = 0 > [ 8.108827] EA = 0, S1PTW = 0 > [ 8.108939] FSC = 0x04: level 0 translation fault > [ 8.109102] Data abort info: > [ 8.109203] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 > [ 8.109399] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 > [ 8.109614] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 > [ 8.109836] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101354000 > [ 8.110011] [0000000000000004] pgd=0000000000000000, p4d=0000000000000000 > [ 8.112624] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP > [ 8.112783] Modules linked in: > [ 8.113120] CPU: 0 PID: 99 Comm: may_access_dire Not tainted 6.10.0-rc3-next-20240613-dirty #1 > [ 8.113230] Hardware name: linux,dummy-virt (DT) > [ 8.113390] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) > [ 8.113429] pc : may_access_direct_pkt_data+0x24/0xa0 > [ 8.113746] lr : add_subprog_and_kfunc+0x634/0x8e8 > [ 8.113798] sp : ffff80008283b9f0 > [ 8.113813] x29: ffff80008283b9f0 x28: ffff800082795048 x27: 0000000000000001 > [ 8.113881] x26: ffff0000c0bb2600 x25: 0000000000000000 x24: 0000000000000000 > [ 8.113897] x23: ffff0000c1134000 x22: 000000000001864f x21: ffff0000c1138000 > [ 8.113912] x20: 0000000000000001 x19: ffff0000c12b8000 x18: ffffffffffffffff > [ 8.113929] x17: 0000000000000000 x16: 0000000000000000 x15: 0720072007200720 > [ 8.113944] x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 > [ 8.113958] x11: 0720072007200720 x10: 0000000000f9fca4 x9 : ffff80008021f4e4 > [ 8.113991] x8 : 0101010101010101 x7 : 746f72705f6d656d x6 : 000000001e0e0f5f > [ 8.114006] x5 : 000000000001864f x4 : ffff0000c12b8000 x3 : 000000000000001c > [ 8.114020] x2 : 0000000000000002 x1 : 0000000000000000 x0 : 0000000000000000 > [ 8.114126] Call trace: > [ 8.114159] may_access_direct_pkt_data+0x24/0xa0 > [ 8.114202] bpf_check+0x3bc/0x28c0 > [ 8.114214] bpf_prog_load+0x658/0xa58 > [ 8.114227] __sys_bpf+0xc50/0x2250 > [ 8.114240] __arm64_sys_bpf+0x28/0x40 > [ 8.114254] invoke_syscall.constprop.0+0x54/0xf0 > [ 8.114273] do_el0_svc+0x4c/0xd8 > [ 8.114289] el0_svc+0x3c/0x140 > [ 8.114305] el0t_64_sync_handler+0x134/0x150 > [ 8.114331] el0t_64_sync+0x168/0x170 > [ 8.114477] Code: 7100707f 54000081 f9401c00 f9403800 (b9400403) > [ 8.118672] ---[ end trace 0000000000000000 ]--- > > One way to fix it is by forcing `attach_prog_fd` non-empty when > bpf_prog_load(). But this will lead to `libbpf_probe_bpf_prog_type` > API broken which use verifier log to probe prog type and will log > nothing if we reject invalid EXT prog before bpf_check(). > > Another way is by adding null check in resolve_prog_type(). > > The issue was introduced by commit 4a9c7bbe2ed4 ("bpf: Resolve to > prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT") which wanted > to correct type resolution for BPF_PROG_TYPE_TRACING programs. Before > that, the type resolution of BPF_PROG_TYPE_EXT prog actually follows > the logic below: > > prog->aux->dst_prog ? prog->aux->dst_prog->type : prog->type; > > It implies that when EXT program is not yet attached to `dst_prog`, > the prog type should be EXT itself. This code worked fine in the past. > So just keep using it. > > Fix this by returning `prog->type` for BPF_PROG_TYPE_EXT if `dst_prog` > is not present in resolve_prog_type(). > > Fixes: 4a9c7bbe2ed4 ("bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT") > Cc: stable@xxxxxxxxxxxxxxx # v5.18+ > Signed-off-by: Tengda Wu <wutengda@xxxxxxxxxxxxxxx> > --- > include/linux/bpf_verifier.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index e4070fb02b11..ff2a6cdb1fa3 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -846,7 +846,7 @@ static inline u32 type_flag(u32 type) > /* only use after check_attach_btf_id() */ > static inline enum bpf_prog_type resolve_prog_type(const struct bpf_prog *prog) > { > - return prog->type == BPF_PROG_TYPE_EXT ? > + return (prog->type == BPF_PROG_TYPE_EXT && prog->aux->dst_prog) ? > prog->aux->dst_prog->type : prog->type; > } > It seems not good enough here. When I want to update an attached freplace prog to PROG_ARRAY map, it will fail, like this selftest[0]. In this selftest, there is a PROG_ARRAY map in freplace prog. And when loading with freplace's target BPF_PROG_TYPE_SCHED_CLS prog, PROG_ARRAY map owner type is set as BPF_PROG_TYPE_SCHED_CLS. Next, after attaching freplace prog, when I want to update the freplace prog, it will fail because resolve_prog_type() returns BPF_PROG_TYPE_EXT (because freplace prog->aux->dst_prog is NULL after attaching). When to run the selftest: ./test_progs -v -t tailcalls/tailcall_freplace test_tailcall_freplace:PASS:open fr_skel 0 nsec test_tailcall_freplace:PASS:open tc_skel 0 nsec test_tailcall_freplace:PASS:tc_skel entry prog_id 0 nsec test_tailcall_freplace:PASS:set_attach_target 0 nsec test_tailcall_freplace:PASS:load fr_skel 0 nsec test_tailcall_freplace:PASS:attatch_freplace 0 nsec test_tailcall_freplace:PASS:fr_skel entry prog_fd 0 nsec test_tailcall_freplace:PASS:fr_skel jmp_table map_fd 0 nsec test_tailcall_freplace:FAIL:update jmp_table unexpected error: -22 (errno 22) #327/25 tailcalls/tailcall_freplace:FAIL It is better to fix by my way[1], which passes all selftests[2]. When to run the selftest with my way: ./test_progs -v -t tailcalls/tailcall_freplace test_tailcall_freplace:PASS:open fr_skel 0 nsec test_tailcall_freplace:PASS:open tc_skel 0 nsec test_tailcall_freplace:PASS:tc_skel entry prog_id 0 nsec test_tailcall_freplace:PASS:set_attach_target 0 nsec test_tailcall_freplace:PASS:load fr_skel 0 nsec test_tailcall_freplace:PASS:attatch_freplace 0 nsec test_tailcall_freplace:PASS:fr_skel entry prog_fd 0 nsec test_tailcall_freplace:PASS:fr_skel jmp_table map_fd 0 nsec test_tailcall_freplace:PASS:update jmp_table 0 nsec test_tailcall_freplace:PASS:prog_fd 0 nsec test_tailcall_freplace:PASS:test_run 0 nsec test_tailcall_freplace:PASS:test_run retval 0 nsec #327/25 tailcalls/tailcall_freplace:OK [0] https://lore.kernel.org/bpf/20240602122421.50892-3-hffilwlqm@xxxxxxxxx/ [1] https://lore.kernel.org/bpf/20240602122421.50892-2-hffilwlqm@xxxxxxxxx/ [2] https://github.com/kernel-patches/bpf/pull/7432/checks Thanks, Leon