Re: Problems with pointer offsets on ARM32

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 14, 2020 at 12:55 AM Luka Oreskovic
<luka.oreskovic@xxxxxxxxxx> wrote:
>
> On Fri, Sep 11, 2020 at 8:14 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Fri, Sep 11, 2020 at 9:45 AM Luka Oreskovic
> > <luka.oreskovic@xxxxxxxxxx> wrote:
> > >
> > > Greetings everyone,
> > >
> > > I have been testing various BPF programs on the ARM32 architecture and
> > > have encountered a strange error.
> > >
> > > When trying to run a simple program that prints out the arguments of
> > > the open syscall,
> > > I found some strange behaviour with the pointer offsets when accessing
> > > the arguments:
> > > The output of llvm-objdump differed from the verifier error dump log.
> > > Notice the differences in lines 0 and 1. Why is the bytecode being
> > > altered at runtime?
> > >
> > > I attached the program, the llvm-objdump result and the verifier dump below.
> > >
> > > Best wishes,
> > > Luka Oreskovic
> > >
> > > BPF program
> > > --------------------------------------------
> > > #include "vmlinux.h"
> > > #include <bpf/bpf_helpers.h>
> > >
> > > SEC("tracepoint/syscalls/sys_enter_open")
> > > int tracepoint__syscalls__sys_enter_open(struct trace_event_raw_sys_enter* ctx)
> > > {
> > >         const char *arg1 = (const char *)ctx->args[0];
> > >         int arg2 = (int)ctx->args[1];
> > >
> > >         bpf_printk("Open arg 1: %s\n", arg1);
> > >         bpf_printk("Open arg 2: %d\n", arg2);
> > >
> > >         return 0;
> > > }
> > >
> > > char LICENSE[] SEC("license") = "GPL";
> > >
> > >
> > > llvm-objdump of program
> > > --------------------------------------------
> > > Disassembly of section tracepoint/syscalls/sys_enter_open:
> > >
> > > 0000000000000000 tracepoint__syscalls__sys_enter_open:
> > > ;       int arg2 = (int)ctx->args[1];
> > >        0:       79 16 18 00 00 00 00 00 r6 = *(u64 *)(r1 + 24)
> > > ;       const char *arg1 = (const char *)ctx->args[0];
> > >        1:       79 13 10 00 00 00 00 00 r3 = *(u64 *)(r1 + 16)
> > >        2:       18 01 00 00 20 31 3a 20 00 00 00 00 25 73 0a 00 r1 =
> > > 2941353058775328 ll
> > > ;       bpf_printk("Open arg 1: %s\n", arg1);
> > >        4:       7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1
> > >        5:       18 07 00 00 4f 70 65 6e 00 00 00 00 20 61 72 67 r7 =
> > > 7454127125170581583 ll
> > >        7:       7b 7a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r7
> > >        8:       bf a1 00 00 00 00 00 00 r1 = r10
> > >        9:       07 01 00 00 f0 ff ff ff r1 += -16
> > >       10:       b7 02 00 00 10 00 00 00 r2 = 16
> > >       11:       85 00 00 00 06 00 00 00 call 6
> > >       12:       18 01 00 00 20 32 3a 20 00 00 00 00 25 64 0a 00 r1 =
> > > 2924860384358944 ll
> > > ;       bpf_printk("Open arg 2: %d\n", arg2);
> > >       14:       7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1
> > >       15:       7b 7a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r7
> > >       16:       bf a1 00 00 00 00 00 00 r1 = r10
> > >       17:       07 01 00 00 f0 ff ff ff r1 += -16
> > >       18:       b7 02 00 00 10 00 00 00 r2 = 16
> > >       19:       bf 63 00 00 00 00 00 00 r3 = r6
> > >       20:       85 00 00 00 06 00 00 00 call 6
> > > ;       return 0;
> > >       21:       b7 00 00 00 00 00 00 00 r0 = 0
> > >       22:       95 00 00 00 00 00 00 00 exit
> > >
> > >
> > > verifier output when running program
> > > --------------------------------------------
> > > libbpf: -- BEGIN DUMP LOG ---
> > > libbpf:
> > > Unrecognized arg#0 type PTR
> > > ; int arg2 = (int)ctx->args[1];
> > > 0: (79) r6 = *(u64 *)(r1 +16)
> > > ; const char *arg1 = (const char *)ctx->args[0];
> > > 1: (79) r3 = *(u64 *)(r1 +12)
> > > invalid bpf_context access off=12 size=8
> > > processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0
> > > peak_states 0 mark_read 0
> > >
> > > libbpf: -- END LOG --
> >
> >
> > One suspect would be libbpf's CO-RE relocations. Can you send full
> > debug libbpf logs, it will have a full log of what libbpf adjusted.
> > Please also include the definition of struct trace_event_raw_sys_enter
> > from your vmlinux.h, as well as commit that your kernel was built from
> > (to check the original definition).
>
>
> Here is the data you requested. I can see the reallocations done by BPF CO-RE,
> but I don't understand why they would have to be done in the first place
> since I am using the vmlinux.h that has been generated using the
> devices vmlinux.
> Even if it made sense to change the pointer offsets, they shouldn't
> break the program.

Ok, there are few things at work here. I'll try to explain it and
let's see how we can fix this. This is all due to 32-bit architecture,
of course.

1. BPF "architecture" is strictly 64-bit one. This means that from the
BPF program point of view pointers are 64-bit values, long is 64-bit
as well.
2. look at struct trace_event_raw_sys_enter below: id and each element
of args[] array are longs. What this means is the following:
  - from BPF side of view those are assumed to be 64-bit integers
  - but in reality in ARM32 kernel and in memory they are 32-bit integers.

So the memory view of struct trace_event_raw_sys_enter differs quite a
lot, depending in which context that struct definition is used. You
can see that from the compiled BPF assembly output above and compare
to the actual memory layout, taking into account that long is 4 bytes.
For BPF args[0] is at 16 bytes, args[1] is at 24 bytes. But in reality
it's at 12 byte offset for args[0] and 16 byte offset for args[1]. So
what libbpf/CO-RE is doing is right, it actually relocates offset 16
to 12 and 24 to 16. So without CO-RE you'd be accessing completely
wrong offsets and reading garbage.
3. But read size matters as well. When compiling your BPF program
clang sees 8-byte read and emits corresponding BPF assembly
instruction. While in reality it has to be 4-byte read. Then later in
the verifier (see tp_prog_is_valid_access() in
kernel/trace/bpf_trace.c) there is a very simple check that all
offsets are multiples of access size, and 12 % 8 != 0, so the verifier
complains.

It's a bit verbose explanation, but I hope it describes the problem.
Basically, longs in a mixed BPF/32-bit land are error prone (as well
as pointers in kernel data structures, btw).

Now, a short-term fix for your use case would be to copy/paste struct
trace_event_raw_sys_enter and replace all longs with ints. That will
fix access size and offsets.

Longer-term and a more proper fix is probably for libbpf to be smarter
when emitting vmlinux.h and replace 4-byte longs with ints, knowing
that vmlinux.h is going to be used from always-64-bit BPF
architecture. I'll think a bit more about the implications of this,
and if there are no problems with that approach, I'll submit a change
soon enough. Pointers will still be a problem, though, and I'm not
sure how we can solve it yet. Ideally there would be some solution
that would cause Clang to emit 4-byte reads for pointers. I wonder if
anyone else has any ideas around that.

As a separate thing, it might be a good idea to allow 4-byte aligned
8-byte reads in the verifier on 32-bit architectures, given they are
allowed on such architectures. Though for you that would lead to
garbage in the upper 4 bytes of BPF register, which is equally bad.

>
>
> Kernel commit
> --------------------------------------------
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/?h=v5.9-rc3
>
>
> Struct definition from vmlinux.h
> --------------------------------------------
> struct trace_event_raw_sys_enter {
>         struct trace_entry ent;
>         long int id;
>         long unsigned int args[6];
>         char __data[0];
> };
>
>

[...]

> libbpf: collected 0 externs total
> libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0
> libbpf: sec 'tracepoint/syscalls/sys_enter_open': found 2 CO-RE relocations
> libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #0: kind
> <byte_off> (0), spec is [2] struct trace_event_raw_sys_ent)

these logs seem to be truncated, btw

> libbpf: CO-RE relocating [2] struct trace_event_raw_sys_enter: found
> target candidate [4639] struct trace_event_raw_sys_entr
> libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #0: matching
> candidate #0 [4639] struct trace_event_raw_sys_enter.a)

like here it actually should be something like:

struct trace_event_raw_sys_enter.args[0]

> libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #0: patched
> insn #0 (LDX/ST/STX) off 24 -> 16
> libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #1: kind
> <byte_off> (0), spec is [2] struct trace_event_raw_sys_ent)
> libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #1: matching
> candidate #0 [4639] struct trace_event_raw_sys_enter.a)
> libbpf: prog 'tracepoint/syscalls/sys_enter_open': relo #1: patched
> insn #1 (LDX/ST/STX) off 16 -> 12
> libbpf: load bpf program failed: Permission denied
> libbpf: -- BEGIN DUMP LOG ---
> libbpf:
> Unrecognized arg#0 type PTR
> ; int arg2 = (int)ctx->args[1];
> 0: (79) r6 = *(u64 *)(r1 +16)
> ; const char *arg1 = (const char *)ctx->args[0];
> 1: (79) r3 = *(u64 *)(r1 +12)
> invalid bpf_context access off=12 size=8
> processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0
> peak_states 0 mark_read 0
>
> libbpf: -- END LOG --
> libbpf: failed to load program 'tracepoint/syscalls/sys_enter_open'
> libbpf: failed to load object 'hello_bpf'
> libbpf: failed to load BPF skeleton 'hello_bpf': -4007
> failed to load BPF object -4007



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux