Re: Problems with pointer offsets on ARM32

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 14, 2020 at 7:49 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Mon, Sep 14, 2020 at 12:55 AM Luka Oreskovic
> <luka.oreskovic@xxxxxxxxxx> wrote:
> >
> > On Fri, Sep 11, 2020 at 8:14 PM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > > On Fri, Sep 11, 2020 at 9:45 AM Luka Oreskovic
> > > <luka.oreskovic@xxxxxxxxxx> wrote:
> > > >
> > > > Greetings everyone,
> > > >
> > > > I have been testing various BPF programs on the ARM32 architecture and
> > > > have encountered a strange error.
> > > >
> > > > When trying to run a simple program that prints out the arguments of
> > > > the open syscall,
> > > > I found some strange behaviour with the pointer offsets when accessing
> > > > the arguments:
> > > > The output of llvm-objdump differed from the verifier error dump log.
> > > > Notice the differences in lines 0 and 1. Why is the bytecode being
> > > > altered at runtime?
> > > >
> > > > I attached the program, the llvm-objdump result and the verifier dump below.
> > > >
> > > > Best wishes,
> > > > Luka Oreskovic
> > > >
> > > > BPF program
> > > > --------------------------------------------
> > > > #include "vmlinux.h"
> > > > #include <bpf/bpf_helpers.h>
> > > >
> > > > SEC("tracepoint/syscalls/sys_enter_open")
> > > > int tracepoint__syscalls__sys_enter_open(struct trace_event_raw_sys_enter* ctx)
> > > > {
> > > >         const char *arg1 = (const char *)ctx->args[0];
> > > >         int arg2 = (int)ctx->args[1];
> > > >
> > > >         bpf_printk("Open arg 1: %s\n", arg1);
> > > >         bpf_printk("Open arg 2: %d\n", arg2);
> > > >
> > > >         return 0;
> > > > }
> > > >
> > > > char LICENSE[] SEC("license") = "GPL";
> > > >
> > > >
> > > > llvm-objdump of program
> > > > --------------------------------------------
> > > > Disassembly of section tracepoint/syscalls/sys_enter_open:
> > > >
> > > > 0000000000000000 tracepoint__syscalls__sys_enter_open:
> > > > ;       int arg2 = (int)ctx->args[1];
> > > >        0:       79 16 18 00 00 00 00 00 r6 = *(u64 *)(r1 + 24)
> > > > ;       const char *arg1 = (const char *)ctx->args[0];
> > > >        1:       79 13 10 00 00 00 00 00 r3 = *(u64 *)(r1 + 16)
> > > >        2:       18 01 00 00 20 31 3a 20 00 00 00 00 25 73 0a 00 r1 =
> > > > 2941353058775328 ll
> > > > ;       bpf_printk("Open arg 1: %s\n", arg1);
> > > >        4:       7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1
> > > >        5:       18 07 00 00 4f 70 65 6e 00 00 00 00 20 61 72 67 r7 =
> > > > 7454127125170581583 ll
> > > >        7:       7b 7a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r7
> > > >        8:       bf a1 00 00 00 00 00 00 r1 = r10
> > > >        9:       07 01 00 00 f0 ff ff ff r1 += -16
> > > >       10:       b7 02 00 00 10 00 00 00 r2 = 16
> > > >       11:       85 00 00 00 06 00 00 00 call 6
> > > >       12:       18 01 00 00 20 32 3a 20 00 00 00 00 25 64 0a 00 r1 =
> > > > 2924860384358944 ll
> > > > ;       bpf_printk("Open arg 2: %d\n", arg2);
> > > >       14:       7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1
> > > >       15:       7b 7a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r7
> > > >       16:       bf a1 00 00 00 00 00 00 r1 = r10
> > > >       17:       07 01 00 00 f0 ff ff ff r1 += -16
> > > >       18:       b7 02 00 00 10 00 00 00 r2 = 16
> > > >       19:       bf 63 00 00 00 00 00 00 r3 = r6
> > > >       20:       85 00 00 00 06 00 00 00 call 6
> > > > ;       return 0;
> > > >       21:       b7 00 00 00 00 00 00 00 r0 = 0
> > > >       22:       95 00 00 00 00 00 00 00 exit
> > > >
> > > >
> > > > verifier output when running program
> > > > --------------------------------------------
> > > > libbpf: -- BEGIN DUMP LOG ---
> > > > libbpf:
> > > > Unrecognized arg#0 type PTR
> > > > ; int arg2 = (int)ctx->args[1];
> > > > 0: (79) r6 = *(u64 *)(r1 +16)
> > > > ; const char *arg1 = (const char *)ctx->args[0];
> > > > 1: (79) r3 = *(u64 *)(r1 +12)
> > > > invalid bpf_context access off=12 size=8
> > > > processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0
> > > > peak_states 0 mark_read 0
> > > >
> > > > libbpf: -- END LOG --
> > >
> > >
> > > One suspect would be libbpf's CO-RE relocations. Can you send full
> > > debug libbpf logs, it will have a full log of what libbpf adjusted.
> > > Please also include the definition of struct trace_event_raw_sys_enter
> > > from your vmlinux.h, as well as commit that your kernel was built from
> > > (to check the original definition).
> >
> >
> > Here is the data you requested. I can see the reallocations done by BPF CO-RE,
> > but I don't understand why they would have to be done in the first place
> > since I am using the vmlinux.h that has been generated using the
> > devices vmlinux.
> > Even if it made sense to change the pointer offsets, they shouldn't
> > break the program.
>
> Ok, there are few things at work here. I'll try to explain it and
> let's see how we can fix this. This is all due to 32-bit architecture,
> of course.
>
> 1. BPF "architecture" is strictly 64-bit one. This means that from the
> BPF program point of view pointers are 64-bit values, long is 64-bit
> as well.
> 2. look at struct trace_event_raw_sys_enter below: id and each element
> of args[] array are longs. What this means is the following:
>   - from BPF side of view those are assumed to be 64-bit integers
>   - but in reality in ARM32 kernel and in memory they are 32-bit integers.
>
> So the memory view of struct trace_event_raw_sys_enter differs quite a
> lot, depending in which context that struct definition is used. You
> can see that from the compiled BPF assembly output above and compare
> to the actual memory layout, taking into account that long is 4 bytes.
> For BPF args[0] is at 16 bytes, args[1] is at 24 bytes. But in reality
> it's at 12 byte offset for args[0] and 16 byte offset for args[1]. So
> what libbpf/CO-RE is doing is right, it actually relocates offset 16
> to 12 and 24 to 16. So without CO-RE you'd be accessing completely
> wrong offsets and reading garbage.
> 3. But read size matters as well. When compiling your BPF program
> clang sees 8-byte read and emits corresponding BPF assembly
> instruction. While in reality it has to be 4-byte read. Then later in
> the verifier (see tp_prog_is_valid_access() in
> kernel/trace/bpf_trace.c) there is a very simple check that all
> offsets are multiples of access size, and 12 % 8 != 0, so the verifier
> complains.
>
> It's a bit verbose explanation, but I hope it describes the problem.
> Basically, longs in a mixed BPF/32-bit land are error prone (as well
> as pointers in kernel data structures, btw).
>
> Now, a short-term fix for your use case would be to copy/paste struct
> trace_event_raw_sys_enter and replace all longs with ints. That will
> fix access size and offsets.
>
> Longer-term and a more proper fix is probably for libbpf to be smarter
> when emitting vmlinux.h and replace 4-byte longs with ints, knowing
> that vmlinux.h is going to be used from always-64-bit BPF
> architecture. I'll think a bit more about the implications of this,
> and if there are no problems with that approach, I'll submit a change
> soon enough. Pointers will still be a problem, though, and I'm not
> sure how we can solve it yet. Ideally there would be some solution
> that would cause Clang to emit 4-byte reads for pointers. I wonder if
> anyone else has any ideas around that.
>
> As a separate thing, it might be a good idea to allow 4-byte aligned
> 8-byte reads in the verifier on 32-bit architectures, given they are
> allowed on such architectures. Though for you that would lead to
> garbage in the upper 4 bytes of BPF register, which is equally bad.
>
> >
> >
> > Kernel commit
> > --------------------------------------------
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/?h=v5.9-rc3
> >
> >
> > Struct definition from vmlinux.h
> > --------------------------------------------
> > struct trace_event_raw_sys_enter {
> >         struct trace_entry ent;
> >         long int id;
> >         long unsigned int args[6];
> >         char __data[0];
> > };
> >
> >
>
> [...]
>
> > libbpf: collected 0 externs total
> > libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
> > libbpf: sec 'tracepoint/syscalls/sys_enter_open': found 2 CO-RE relocations
> > libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #0: kind
> > <byte_off> (0), spec is [2] struct trace_event_raw_sys_ent)
>
> these logs seem to be truncated, btw
>
> > libbpf: CO-RE relocating [2] struct trace_event_raw_sys_enter: found
> > target candidate [4639] struct trace_event_raw_sys_entr
> > libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #0: matching
> > candidate #0 [4639] struct trace_event_raw_sys_enter.a)
>
> like here it actually should be something like:
>
> struct trace_event_raw_sys_enter.args[0]
>
> > libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #0: patched
> > insn #0 (LDX/ST/STX) off 24 -> 16
> > libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #1: kind
> > <byte_off> (0), spec is [2] struct trace_event_raw_sys_ent)
> > libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #1: matching
> > candidate #0 [4639] struct trace_event_raw_sys_enter.a)
> > libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #1: patched
> > insn #1 (LDX/ST/STX) off 16 -> 12
> > libbpf: load bpf program failed: Permission denied
> > libbpf: -- BEGIN DUMP LOG ---
> > libbpf:
> > Unrecognized arg#0 type PTR
> > ; int arg2 = (int)ctx->args[1];
> > 0: (79) r6 = *(u64 *)(r1 +16)
> > ; const char *arg1 = (const char *)ctx->args[0];
> > 1: (79) r3 = *(u64 *)(r1 +12)
> > invalid bpf_context access off=12 size=8
> > processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0
> > peak_states 0 mark_read 0
> >
> > libbpf: -- END LOG --
> > libbpf: failed to load program 'tracepoint/syscalls/sys_enter_open'
> > libbpf: failed to load object 'hello_bpf'
> > libbpf: failed to load BPF skeleton 'hello_bpf': -4007
> > failed to load BPF object -4007


Thank you very much for your informative reply. I look forward to the changes.

Best wishes,
Luka Oreskovic



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux