On Tue, 15 Jun 2021 at 19:21, Tony Ambardar <tony.ambardar@xxxxxxxxx> wrote: > > On Sun, 13 Jun 2021 at 23:14, Yonghong Song <yhs@xxxxxx> wrote: > > > > On 6/12/21 5:07 PM, Tony Ambardar wrote: > > > On Fri, 11 Jun 2021 at 08:57, Yonghong Song <yhs@xxxxxx> wrote: > > >> > > >> On 6/10/21 6:02 PM, Tony Ambardar wrote: > > >>> Hello, > > >>> > > >>> I encountered an NPE and kernel Oops [1] while running the > > >>> 'test_verifier' selftest on MIPS32 with LTS kernel 5.10.41. This was > > >>> observed during development of a MIPS32 JIT but is verifier-related. > > >>> > > >>> Initial troubleshooting [2] points to an unchecked NULL dereference in > > >>> btf_type_by_id(), with an unexpected BTF type ID. The root cause is > > >>> unclear, whether source of the ID or a potential underlying BTF > > >>> problem. > > >> > > >> Do you know what is the faulty btf ID number? What is the maximum id > > >> for vmlinux BTF? > > > > > > Thanks for the suggestions, Yonghong. > > > > > > I had built/packaged bpftool for the target, which shows the maximum as: > > > > > > root@OpenWrt:~# bpftool btf dump file /sys/kernel/btf/vmlinux format > > > raw|tail -5 > > > [43179] FUNC 'pci_load_of_ranges' type_id=43178 linkage=static > > > [43180] ARRAY '(anon)' type_id=23 index_type_id=23 nr_elems=16 > > > [43181] FUNC 'pcibios_plat_dev_init' type_id=29264 linkage=static > > > [43182] FUNC 'pcibios_map_irq' type_id=29815 linkage=static > > > [43183] FUNC 'mips_pcibios_init' type_id=115 linkage=static > > > > > > After adding NULL handling and debug pr_err() to kernel_type_name(), I next see: > > > > > > root@OpenWrt:~# ./test_verifier_eb 828 > > > [ 87.196692] btf_type_by_id(btf_vmlinux, 3062497280) returns NULL > > > [ 87.196958] btf_type_by_id(btf_vmlinux, 2936995840) returns NULL > > > #828/p reference tracking: bpf_sk_release(btf_tcp_sock) FAIL > > > > > > Those large type ids make me suspect an endianness issue, even though bpftool > > > can still properly access the vmlinux BTF. Changing byte order and > > > looking up the > > > resulting type ids seems to confirm this: > > > > > > Check endianness: > > > 3062497280 -> 0xB68A0000 --swap endian--> 0x00008AB6 -> 35510 > > > bpftool btf dump file /sys/kernel/btf/vmlinux format raw|fgrep "[35510]": > > > [35510] STRUCT 'tcp_sock' size=1752 vlen=136 > > > > > > Check endianness: > > > 2936995840 -> 0xAF0F0000 --swap endian--> 0x00000FAF -> 4015 > > > bpftool btf dump file /sys/kernel/btf/vmlinux format raw|fgrep "[4015]": > > > [4015] STRUCT 'sock_common' size=112 vlen=25 > > > > > > As a further test, I repeated "test_verifier 828" across mips{32,64}{be,le} > > > systems and confirm seeing the problem only with the big-endian ones. > > > > From the above information, looks like vmlinux BTF is correct. > > Below resolve_btfids command output seems indicating the btf_id list > > is also correct. > > > > The kernel_type_name is used in a few places for verifier verbose output. > > > > $ grep kernel_type_name kernel/bpf/verifier.c > > static const char *kernel_type_name(const struct btf* btf, u32 id) > > verbose(env, "%s", > > kernel_type_name(reg->btf, reg->btf_id)); > > regno, kernel_type_name(reg->btf, > > reg->btf_id), > > kernel_type_name(btf_vmlinux, > > *arg_btf_id)); > > > > The most suspicous target is reg->btf_id, which is propagated from > > the result of bpf_sk_lookup_tcp() helper. > > > > > > > >> The involved helper is bpf_sk_release. > > >> > > >> static const struct bpf_func_proto bpf_sk_release_proto = { > > >> .func = bpf_sk_release, > > >> .gpl_only = false, > > >> .ret_type = RET_INTEGER, > > >> .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON, > > >> }; > > >> > > >> Eventually, the btf_id is taken from btf_sock_ids[6] where > > >> btf_sock_ids is a kernel global variable. > > >> > > >> Could you check btf_sock_ids[6] to see whether the number > > >> makes sense? > > > > > > What I see matches the second btf_type_by_id() NULL call above: > > > [ 56.556121] btf_sock_ids[6]: 2936995840 > > > > > >> The id is computed by resolve_btfids in > > >> tools/bpf/resolve_btfids, you might add verbose mode to your linux build > > >> to get more information. > > > > > > The verbose build didn't print any details of the btf ids. Was there anything > > > special to do in invocation? I manually ran "resolve_btfids -v vmlinux" from > > > the build dir and this, strangely, gave slightly different results than bpftool > > > but not the huge endian-swapped type ids. Is this expected? > > > > > > # ./tools/bpf/resolve_btfids/resolve_btfids -v vmlinux > > > ... > > > patching addr 116: ID 35522 [tcp_sock] > > > ... > > > patching addr 112: ID 4021 [sock_common] > > > > > > Do any of the details above help narrow down things? What do you suggest > > > for next steps? > > > > We need to identify issues by dumping detailed verifier logs. > > Could you apply the following change? > > > > --- a/tools/testing/selftests/bpf/test_verifier.c > > +++ b/tools/testing/selftests/bpf/test_verifier.c > > @@ -1088,7 +1088,7 @@ static void do_test_single(struct bpf_test *test, > > bool unpriv, > > attr.insns_cnt = prog_len; > > attr.license = "GPL"; > > if (verbose) > > - attr.log_level = 1; > > + attr.log_level = 3; > > else if (expected_ret == VERBOSE_ACCEPT) > > attr.log_level = 2; > > else > > > > Run command like `./test_verifier -v 828 828`? > > > > I attached the verifier output for x86_64. > > Maybe by comparing x86 output vs. mips32 output, you can > > find which insn starts to have *wrong* verifier state > > and then we can go from there. > > I realized too late your test output must be for a different kernel version as > well as arch, as the test numbering is different and doesn't match my test: > "reference tracking: bpf_sk_release(btf_tcp_sock)". > > Given the problem is seen on big-endian and not little-systems, I applied > your patch for both mips32 variant systems and recaptured log output, > which should make for a stricter A/B comparison. I also kept my earlier > patches to catch the NULLs and print debug info. > > The logs are identical until insn #18, where the failing MIPS32BE shows: > > 18: R0_w=ptr_or_null_(null)(id=3,off=0,imm=0) > R6_w=sock(id=0,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8=????0000 > fp-16=0000mmmm fp-24=mmmmmmmm fp-32=mmmmmmmm fp-40=mmmmmmmm > fp-48=mmmmmmmm refs=2 > > while the succeed MIPS32LE test shows: > > 18: R0_w=ptr_or_null_tcp_sock(id=3,off=0,imm=0) > R6_w=sock(id=0,ref_obj_id=2,off=0,imm=0) R10=fp0 fp-8=????0000 > fp-16=0000mmmm fp-24=mmmmmmmm fp-32=mmmmmmmm fp-40=mmmmmmmm > fp-48=mmmmmmmm refs=2 > > There are then further differences you can see in the attached logs. It's > not clear to me what these differences mean however. Any ideas? > > Following your earlier comments on the large, endian-swapped values > in btf_sock_ids[6], I noticed this is true of all btf_sock_ids[] > elements, based on debug output: > > btf_sock_ids[0] = 2139684864 > btf_sock_ids[1] = 2794061824 > btf_sock_ids[2] = 2844459008 > btf_sock_ids[3] = 1234305024 > btf_sock_ids[4] = 3809411072 > btf_sock_ids[5] = 1946812416 > btf_sock_ids[6] = 2936995840 > btf_sock_ids[7] = 3062497280 > btf_sock_ids[8] = 2861236224 > btf_sock_ids[9] = 1251082240 > btf_sock_ids[10] = 1334968320 > btf_sock_ids[11] = 1267859456 > btf_sock_ids[12] = 1318191104 > > If these are populated by resolve_btfids, how could we re-verify that > it's being done properly? Following up on this thought, I used objdump to look at the .BTF_ids section and the "struct sock_common" type, comparing the ids for little- and big-endian vmlinux after processing by resolv_btfids. It appears resolv_btfids always writes these ids as little-endian (the build host) and ignores the target endianness. Do I read this correctly? $ objdump --syms -s -j .BTF_ids build_dir/target-mipsel_24kc_musl/linux-malta_le/linux-5.10.41/vmlinux build_dir/target-mipsel_24kc_musl/linux-malta_le/linux-5.10.41/vmlinux: file format elf32-little SYMBOL TABLE: ... 80bbee08 l O .BTF_ids 00000004 __BTF_ID__struct__sock_common__1116 Contents of section .BTF_ids: ... 80bbee08 b40f0000 bc8a0000 b08b0000 50920000 ............P... $ objdump --syms -s -j .BTF_ids build_dir/target-mips_24kc_musl/linux-malta_be/linux-5.10.41/vmlinux build_dir/target-mips_24kc_musl/linux-malta_be/linux-5.10.41/vmlinux: file format elf32-big SYMBOL TABLE: ... 80bc3ebc l O .BTF_ids 00000004 __BTF_ID__struct__sock_common__1116 Contents of section .BTF_ids: ... 80bc3ebc b50f0000 c28a0000 b68b0000 56920000 ............V... I also want to highlight a discrepency I reported earlier between btf ids for "sock_common" reported by resolv_btfids (4021) and bpftool (4015). Why do you suppose this is? > > > > > > Thanks, > > > Tony > > > > > >>> > > >>> Has this been seen before? How best to debug this further or resolve? > > >>> What other details would be useful for BPF kernel developers? > > >>> > > >>> Thanks for any help, > > >>> Tony > > >>> > > >>> [1]: > > >>> (Host details) > > >>> kodidev:~/openwrt-project$ ./staging_dir/host/bin/pahole --version > > >>> v1.21 > > >>> > > >>> (Target details) > > >>> root@OpenWrt:/# uname -a > > >>> Linux OpenWrt 5.10.41 #0 SMP Tue Jun 1 00:54:31 2021 mips GNU/Linux > > >>> > > >>> root@OpenWrt:~# sysctl net.core.bpf_jit_enable=0; ./test_verifier 826 828 > > >>> net.core.bpf_jit_enable = 0 > > >>> > > >>> #826/p reference tracking: branch tracking valid pointer null comparison OK > > >>> #827/p reference tracking: branch tracking valid pointer value comparison OK > > >>> CPU 0 Unable to handle kernel paging request at virtual address > > >>> 00000000, epc == 80244654, ra == 80244654 > > >>> Oops[#1]: > > >>> CPU: 0 PID: 16274 Comm: test_verifier Not tainted 5.10.41 #0 > > >>> $ 0 : 00000000 00000001 00000000 0000a8a2 > > >>> $ 4 : 835ac580 a6280000 00000000 00000001 > > >>> $ 8 : 835ac580 a6280000 00000000 02020202 > > >>> $12 : 8348de58 834ba800 00000000 00000000 > > >>> $16 : 835ac580 8098be2c fffffff3 834bdb38 > > >>> $20 : 8098be0c 00000001 00000018 00000000 > > >>> $24 : 00000000 01415415 > > >>> $28 : 834bc000 834bdac8 00000005 80244654 > > >>> Hi : 00000017 > > >>> Lo : 0a3d70a2 > > >>> epc : 80244654 kernel_type_name+0x20/0x38 > > >>> ra : 80244654 kernel_type_name+0x20/0x38 > > >>> Status: 1000a403 KERNEL EXL IE > > >>> Cause : 00800008 (ExcCode 02) > > >>> BadVA : 00000000 > > >>> PrId : 00019300 (MIPS 24Kc) > > >>> Modules linked in: pppoe ppp_async pppox ppp_generic mac80211_hwsim > > >>> mac80211 iptable_nat ipt_REJECT cfg80211 xt_time xt_tcpudp xt_tcpmss > > >>> xt_statistic xt_state xt_recent xt_nat xt_multiport xt_mark xt_mac > > >>> xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack > > >>> xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT > > >>> xt_MASQUERADE xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY > > >>> slhc sch_mqprio sch_cake pcnet32 nf_reject_ipv4 nf_nat nf_log_ipv4 > > >>> nf_flow_table nf_conntrack_netlink nf_conncount iptable_raw > > >>> iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt compat > > >>> cls_flower act_vlan pktgen sch_teql sch_sfq sch_red sch_prio sch_pie > > >>> sch_multiq sch_gred sch_fq sch_dsmark sch_codel em_text em_nbyte > > >>> em_meta em_cmp act_simple act_police act_pedit act_ipt act_csum > > >>> libcrc32c em_ipset cls_bpf act_bpf act_ctinfo act_connmark > > >>> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sch_tbf sch_ingress sch_htb > > >>> sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw > > >>> cls_flow cls_basic act_skbedit act_mirred act_gact xt_set > > >>> ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport > > >>> ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net > > >>> ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip > > >>> ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip > > >>> ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set > > >>> nfnetlink nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter > > >>> ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ifb dummy netlink_diag > > >>> mii > > >>> Process test_verifier (pid: 16274, threadinfo=c1418596, task=05765195, > > >>> tls=77e5aec8) > > >>> Stack : 83428000 83428000 8098be2c 00000000 83428000 8024af78 834bacdc 834bb000 > > >>> a98a0000 834e2580 834e2c00 00000000 834e2c00 8023da9c 834bb070 00000013 > > >>> 80925164 80924f44 00000000 80925164 00000000 83428140 80bc3864 834bb070 > > >>> 834e2c00 00000000 00000010 802c441c 00000000 00000000 00000000 00000000 > > >>> 00000000 00000000 00000000 00000000 00000000 00000056 00000000 00000000 > > >>> ... > > >>> Call Trace: > > >>> [<80244654>] kernel_type_name+0x20/0x38 > > >>> [<8024af78>] check_helper_call+0x1c9c/0x1dbc > > >>> [<8024d008>] do_check_common+0x1f70/0x2a3c > > >>> [<8024fb6c>] bpf_check+0x18f8/0x2308 > > >>> [<802369ec>] bpf_prog_load+0x378/0x860 > > >>> [<80237e1c>] __do_sys_bpf+0x3e0/0x2100 > > >>> [<801142d8>] syscall_common+0x34/0x58 > > >>> > > >>> Code: afbf0014 0c099b58 02002025 <8c450000> 8fbf0014 02002025 > > >>> 8fb00010 08099b4f 27bd0018 > > >>> > > >>> ---[ end trace ab13ac5f89eb825b ]--- > > >>> Kernel panic - not syncing: Fatal exception > > >>> Rebooting in 3 seconds.. > > >>> QEMU: Terminated > > >>> > > >>> > > >>> [2]: > > >>> Function Code: > > >>> ============== > > >>> const char *kernel_type_name(u32 id) > > >>> { > > >>> return btf_name_by_offset(btf_vmlinux, > > >>> btf_type_by_id(btf_vmlinux, id)->name_off); > > >>> } > > >>> > > >>> const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id) > > >>> { > > >>> if (type_id > btf->nr_types) > > >>> return NULL; > > >>> > > >>> return btf->types[type_id]; > > >>> } > > >>> > > >>> Disassembled Code: > > >>> ================== > > >>> 0x0000000000000000: AF BF 00 14 sw $ra, 0x14($sp) > > >>> 0x0000000000000004: 0C 09 9B 58 jal btf_type_by_id > > >>> 0x0000000000000008: 02 00 20 25 move $a0, $s0 > > >>> 0x000000000000000c: 8C 45 00 00 lw $a1, ($v0) <-- NPE > > >>> 0x0000000000000010: 8F BF 00 14 lw $ra, 0x14($sp) > > >>> 0x0000000000000014: 02 00 20 25 move $a0, $s0 > > >>> 0x0000000000000018: 8F B0 00 10 lw $s0, 0x10($sp) > > >>> 0x000000000000001c: 08 09 9B 4F j btf_name_by_offset > > >>> 0x0000000000000020: 27 BD 00 18 addiu $sp, $sp, 0x18 > > >>>