Re: Trying the bpf trace a bpf xdp program

On 12/4/19 6:58 AM, Yonghong Song wrote:
> 
> 
> On 12/4/19 5:19 AM, Eelco Chaudron wrote:
>>
>>
>> On 2 Dec 2019, at 17:48, Yonghong Song wrote:
>>
>>> On 12/2/19 8:34 AM, Eelco Chaudron wrote:
>>>> On 29 Nov 2019, at 17:52, Yonghong Song wrote:
>>
>> <SNIP>
>>>
>>> You need to build the kernel with
>>>     CONFIG_DEBUG_INFO_BTF=y
>>> Make sure on the build machine you have recent pahole version >= 1.13.
>>
>> With the latest LLVM and CONFIG_DEBUG_INFO_BTF=y the self-test for
>> bpf2bpf is passing!
> 
> Great!
> 
>>
>> However I still have problems with my code, which is getting to the next
>> step, but now my program is killed when trying to load the eBPF fexit
>> code. If I replace my generated eBPF programs with the ones generated by
>> the self-test (test_pkt_access.o/fexit_bpf2bpf.o) it works fine.
>>
>>
>> I decided to build my objects just like the example programs (so have a
>> hacked build.sh file) but I get the same results. I.e. being killed by
>> the kernel:
>>
>> bpf(BPF_BTF_LOAD,
>> {btf="\237\353\1\0\30\0\0\0\0\0\0\0\330\0\0\0\330\0\0\0\244\0\0\0\0\0\0\0\0\0\0\2"...,
>> btf_log_buf=NULL, btf_size=404, btf_log_size=0, btf_log_level=0}, 120) = 6
>> bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=208,
>> info=0x7ffdfbdac3b0}}, 120) = 0
>> bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=208,
>> info=0xafb600}}, 120) = 0
>> bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=90}, 120) = 5
>> bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=5, info_len=16,
>> info=0x7ffdfbdac4b0}}, 120) = 0
>> - Opened object file: 0xafb440
>> bpf(BPF_PROG_LOAD, {prog_type=0x1a /* BPF_PROG_TYPE_??? */, insn_cnt=2,
>> insns=0xafbaa0, license="GPL", log_level=7, log_size=16777215,
>> log_buf="\237\353\1", kern_version=KERNEL_VERSION(0, 0, 0),
>> prog_flags=0, prog_name="test_main", prog_ifindex=0,
>> expected_attach_type=0x19 /* BPF_??? */, prog_btf_fd=6,
>> func_info_rec_size=8, func_info=0xafb9f0, func_info_cnt=1,
>> line_info_rec_size=16, line_info=0xafba10, line_info_cnt=1, ...}, 120
>> ) = ?
>> +++ killed by SIGKILL +++
>> Killed
>>
>>
>> [79162.619208] BUG: kernel NULL pointer dereference, address:
> 
> This should be a kernel bug. I will take a look at it today.
> 
>> 0000000000000000
>> [79162.619906] #PF: supervisor read access in kernel mode
>> [79162.620582] #PF: error_code(0x0000) - not-present page
>> [79162.621255] PGD 80000001e2409067 P4D 80000001e2409067 PUD 22eba9067
>> PMD 0
>> [79162.621933] Oops: 0000 [#12] SMP PTI
>> [79162.622599] CPU: 5 PID: 3191 Comm: xdp_sample_fent Tainted: G      D
>>            5.4.0+ #3
>> [79162.623274] Hardware name: Red Hat KVM, BIOS
>> 1.11.1-3.module+el8+2529+a9686a4d 04/01/2014
>> [79162.623962] RIP: 0010:bpf_check+0x1648/0x250b
>> [79162.624650] Code: 41 89 c5 0f 88 d1 0a 00 00 41 f6 47 02 01 0f 84 17
>> 0b 00 00 41 83 7f 04 1a 0f 84 0c 0c 00 00 49 8b 47 20 48 63 db 48 8b 40
>> 68 <48> 8b 04 d8 48 8b 40 30 49 89 42 50 49 8b 46 20 4c 89 cf 4c 89 95
>> [79162.626088] RSP: 0018:ffffb5f6c07c3c88 EFLAGS: 00010293
>> [79162.626822] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
>> ffffb5f6c07c3c40
>> [79162.627560] RDX: ffffa0a1e6e01818 RSI: 00000000fffffffa RDI:
>> 0000000000000000
>> [79162.628304] RBP: ffffb5f6c07c3d70 R08: 000000000000000e R09:
>> ffffa0a1f5c9dc90
>> [79162.629053] R10: ffffa0a1f5c9dc80 R11: ffffa0a1e6e0199a R12:
>> ffffa0a1eac48000
>> [79162.629806] R13: 0000000000000000 R14: ffffb5f6c043e000 R15:
>> ffffb5f6c033f000
>> [79162.630562] FS:  00007f560c2e3740(0000) GS:ffffa0a1f7940000(0000)
>> knlGS:0000000000000000
>> [79162.631324] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [79162.632072] CR2: 0000000000000000 CR3: 00000001e242a005 CR4:
>> 0000000000360ee0
>> [79162.632813] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [79162.633539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [79162.634255] Call Trace:
>> [79162.634974]  ? _cond_resched+0x15/0x30
>> [79162.635686]  ? kmem_cache_alloc_trace+0x162/0x220
>> [79162.636398]  ? selinux_bpf_prog_alloc+0x1f/0x60
>> [79162.637111]  bpf_prog_load+0x3de/0x690
>> [79162.637809]  __do_sys_bpf+0x105/0x1740
>> [79162.638488]  do_syscall_64+0x5b/0x180
>> [79162.639147]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [79162.639792] RIP: 0033:0x7f560c3fe1ad
>> [79162.640415] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa
>> 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f
>> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab 5c 0c 00 f7 d8 64 89 01 48
>> [79162.641703] RSP: 002b:00007ffdfbdac318 EFLAGS: 00000202 ORIG_RAX:
>> 0000000000000141
>> [79162.642363] RAX: ffffffffffffffda RBX: 0000000000afb440 RCX:
>> 00007f560c3fe1ad
>> [79162.643026] RDX: 0000000000000078 RSI: 00007ffdfbdac390 RDI:
>> 0000000000000005
>> [79162.643676] RBP: 00007ffdfbdac330 R08: 0000000000afba70 R09:
>> 00007ffdfbdac390
>> [79162.644310] R10: 0000000000afcf10 R11: 0000000000000202 R12:
>> 0000000000402690
>> [79162.644935] R13: 00007ffdfbdac790 R14: 0000000000000000 R15:
>> 0000000000000000
>> [79162.645559] Modules linked in: ip6t_REJECT nf_reject_ipv6
>> ip6t_rpfilter ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat
>> ebtable_broute ip6table_nat ip6table_mangle ip6table_raw
>> ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw
>> iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set
>> nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables
>> iptable_filter intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass
>> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cirrus drm_kms_helper
>> virtio_net net_failover joydev drm failover i2c_piix4 virtio_balloon
>> pcspkr ip_tables xfs libcrc32c crc32c_intel ata_generic floppy
>> virtio_scsi serio_raw pata_acpi qemu_fw_cfg
>> [79162.649591] CR2: 0000000000000000
>> [79162.650272] ---[ end trace 5119c5364c1e9c83 ]---
>> [79162.650957] RIP: 0010:bpf_check+0x1648/0x250b
>> [79162.651646] Code: 41 89 c5 0f 88 d1 0a 00 00 41 f6 47 02 01 0f 84 17
>> 0b 00 00 41 83 7f 04 1a 0f 84 0c 0c 00 00 49 8b 47 20 48 63 db 48 8b 40
>> 68 <48> 8b 04 d8 48 8b 40 30 49 89 42 50 49 8b 46 20 4c 89 cf 4c 89 95
>> [79162.653081] RSP: 0018:ffffb5f6c072bc88 EFLAGS: 00010293
>> [79162.653807] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
>> ffffb5f6c072bc40
>> [79162.654536] RDX: ffffa0a1e76b1418 RSI: 00000000fffffffa RDI:
>> 0000000000000000
>> [79162.655270] RBP: ffffb5f6c072bd70 R08: 000000000000000e R09:
>> ffffa0a1e4d3fa90
>> [79162.655996] R10: ffffa0a1e4d3fa80 R11: ffffa0a1e76b159a R12:
>> ffffa0a1eac7c000
>> [79162.656715] R13: 0000000000000000 R14: ffffb5f6c01e3000 R15:
>> ffffb5f6c015f000
>> [79162.657429] FS:  00007f560c2e3740(0000) GS:ffffa0a1f7940000(0000)
>> knlGS:0000000000000000
>> [79162.658137] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [79162.658826] CR2: 0000000000000000 CR3: 00000001e242a005 CR4:
>> 0000000000360ee0
>> [79162.659515] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [79162.660196] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>>
>>
>> I’ve put my code on GitHub, maybe it’s just something stupid…

Thanks for the test case. This is indeed a kernel bug.
The following change fixed the issue:


-bash-4.4$ git diff
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a0482e1c4a77..034ef81f935b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9636,7 +9636,10 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
                                 ret = -EINVAL;
                                 goto out;
                         }
-                       addr = (long) tgt_prog->aux->func[subprog]->bpf_func;
+                       if (subprog == 0)
+                               addr = (long) tgt_prog->bpf_func;
+                       else
+                               addr = (long) tgt_prog->aux->func[subprog]->bpf_func;
                 } else {
                         addr = kallsyms_lookup_name(tname);
                         if (!addr) {
-bash-4.4$

The reason is that for a bpf program without any additional subprograms
(callees), tgt_prog->aux->func is not populated and is a NULL pointer,
so the access tgt_prog->aux->func[0]->bpf_func dereferences NULL inside
the kernel, which matches the oops in bpf_check() quoted above.
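
To make the failure mode concrete, here is a minimal userspace sketch of
the fixed lookup. The struct layouts, target_addr() and the dummy_*
functions are simplified stand-ins invented for illustration; they are
not the kernel's definitions, and the bounds check is an extra that the
patch itself does not contain.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration only; NOT the kernel's structs. */
typedef unsigned int (*bpf_func_t)(const void *ctx);

struct prog;

struct prog_aux {
	struct prog **func;    /* per-subprogram array; NULL when there are no callees */
	unsigned int func_cnt; /* number of entries in func */
};

struct prog {
	bpf_func_t bpf_func;   /* entry point of this program */
	struct prog_aux *aux;
};

/* Mirrors the fixed logic: subprog 0 is the main program itself,
 * so aux->func is never touched for it. */
long target_addr(const struct prog *tgt_prog, int subprog)
{
	if (subprog == 0)
		return (long) tgt_prog->bpf_func;

	/* Extra sanity check added for this sketch only. */
	assert(tgt_prog->aux->func != NULL);
	assert((unsigned int) subprog < tgt_prog->aux->func_cnt);
	return (long) tgt_prog->aux->func[subprog]->bpf_func;
}

/* Hypothetical program bodies used only to obtain distinct addresses. */
unsigned int dummy_main(const void *ctx) { (void) ctx; return 2; }
unsigned int dummy_sub(const void *ctx)  { (void) ctx; return 0; }
```

With a target whose aux->func is NULL (no callees), target_addr(&p, 0)
returns the address of the main program instead of faulting; the old
code path would have indexed the NULL array at this point.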

With the above change, your test works properly.

I will send a patch upstream soon.

>>
>> https://github.com/chaudron/bpf2bpf-tracing
>>
>>
>> Cheers,
>>
>> Eelco
>>
>>
>> PS: If I run the latest pahole (v1.15) on the .o files, I get the
>> following libbpf error: “libbpf: Cannot find bpf_func_info for main
>> program sec fexit/xdp_prog_simple. Ignore all bpf_func_info.”
>>



