On 11/22/21 8:19 AM, YiFei Zhu wrote:
Hi
I've been investigating the use of BPF CO-RE. I discovered that if I
include vmlinux.h and have all structures annotated with
__attribute__((preserve_access_index)), including the context struct,
then a prog that accesses an array field in the context struct, in
some particular way, cannot pass the verifier.
A bunch of manual reduction plus creduce gives me this output:
struct bpf_sock_ops {
int family;
int remote_ip6[];
} __attribute__((preserve_access_index));
__attribute__((section("sockops"))) int b(struct bpf_sock_ops *d) {
int a = d->family;
int *c = d->remote_ip6;
c[2] = a;
return 0;
}
With Debian clang version 11.1.0-4+build1, this compiles to
0000000000000000 <b>:
0: b7 02 00 00 04 00 00 00 r2 = 4
1: bf 13 00 00 00 00 00 00 r3 = r1
2: 0f 23 00 00 00 00 00 00 r3 += r2
3: 61 11 00 00 00 00 00 00 r1 = *(u32 *)(r1 + 0)
4: 63 13 08 00 00 00 00 00 *(u32 *)(r3 + 8) = r1
5: b7 00 00 00 00 00 00 00 r0 = 0
6: 95 00 00 00 00 00 00 00 exit
And the prog will be rejected with this verifier log:
; __attribute__((section("sockops"))) int b(struct bpf_sock_ops *d) {
0: (b7) r2 = 32
1: (bf) r3 = r1
2: (0f) r3 += r2
last_idx 2 first_idx 0
regs=4 stack=0 before 1: (bf) r3 = r1
regs=4 stack=0 before 0: (b7) r2 = 32
; int a = d->family;
3: (61) r1 = *(u32 *)(r1 +20)
; c[2] = a;
4: (63) *(u32 *)(r3 +8) = r1
dereference of modified ctx ptr R3 off=32 disallowed
processed 5 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Thanks for reporting the issue. The example you had here exposed an llvm
limitation.
For the following code:
> int *c = d->remote_ip6;
> c[2] = a;
The relocation applies only to d->remote_ip6, and the code sequence
below is what computes c = d->remote_ip6:
> 0: (b7) r2 = 32
> 1: (bf) r3 = r1
> 2: (0f) r3 += r2
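The later store through c[2] then dereferences r3, a context pointer
that has already been modified, which is the issue you described above.
Note that LLVM does not generate a relocation for an array access on
its own. The access needs to be part of an access chain like
d->remote_ip6[2] to be relocatable.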
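To make the split concrete, here is a small host-side sketch of the offset arithmetic. It is only an illustration: struct model_sock_ops is a hypothetical stand-in that copies the leading fields of the UAPI struct bpf_sock_ops so that family lands at offset 20 and remote_ip6 at offset 32, matching the verifier log, and the helper names are made up for this example rather than taken from libbpf or LLVM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the first fields of the kernel's
 * struct bpf_sock_ops. Only the layout matters for this sketch. */
struct model_sock_ops {
    uint32_t op;
    uint32_t args[4];       /* stands in for the args/reply/replylong union */
    uint32_t family;        /* offset 20 */
    uint32_t remote_ip4;
    uint32_t local_ip4;
    uint32_t remote_ip6[4]; /* offset 32 */
};

/* Part covered by the CO-RE relocation when the source is written as
 * `int *c = d->remote_ip6;`: this is the immediate added to the ctx copy. */
static size_t relocated_base_offset(void)
{
    return offsetof(struct model_sock_ops, remote_ip6);
}

/* Part that stays in the store instruction for `c[index] = a;`: ordinary
 * pointer arithmetic on c, with no relocation attached. */
static size_t unrelocated_index_offset(size_t index)
{
    return index * sizeof(uint32_t);
}

/* Offset used when the whole chain d->remote_ip6[index] is one expression:
 * base and index are folded into a single load/store offset. */
static size_t chained_offset(size_t index)
{
    return relocated_base_offset() + unrelocated_index_offset(index);
}
```

With index 2 the two-step form yields 32 in the register plus 8 in the instruction, while the chained form yields a single instruction offset of 40 applied directly to the unmodified context pointer.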
Looking at check_ctx_reg() and its callsite at check_mem_access() in
verifier.c, it seems that the verifier really does not like when the
context pointer has an offset, in this case the field offset of
d->remote_ip6.
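As a rough illustration of that rule, the sketch below is a deliberately simplified toy model and not the kernel's code: struct toy_reg and both helpers are hypothetical names, and the real check_ctx_reg() tracks more state (for example variable offsets). It only captures the idea that a context pointer must still carry a zero register offset when it is dereferenced, while the offset encoded in the load/store instruction itself is checked separately.

```c
#include <stdbool.h>

/* Toy register state: whether the register holds a ctx pointer, and the
 * constant offset that has been added to the register so far. */
struct toy_reg {
    bool is_ctx;
    int off;
};

/* Models "r3 += imm": the offset accumulates in the register state. */
static struct toy_reg toy_add_const(struct toy_reg reg, int imm)
{
    reg.off += imm;
    return reg;
}

/* Models the ctx check at a load/store: a ctx pointer may only be
 * dereferenced while its register offset is still zero. */
static bool toy_ctx_deref_ok(struct toy_reg reg)
{
    return !reg.is_ctx || reg.off == 0;
}
```

In this model, r1 as passed to the program is accepted, while the copy produced by "r3 = r1; r3 += 32" is rejected no matter which offset the following store uses.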
I thought this was just an issue with array fields, and that field offset
relocations might have trouble expressing two field accesses (one struct
member, one array element). However, further testing reveals that this
is not the case, because if I eliminate the local variables, the
error is gone:
struct bpf_sock_ops {
int family;
int remote_ip6[];
} __attribute__((preserve_access_index));
__attribute__((section("sockops"))) int b(struct bpf_sock_ops *d) {
d->remote_ip6[2] = d->family;
return 0;
}
is compiled to:
0000000000000000 <b>:
0: 61 12 00 00 00 00 00 00 r2 = *(u32 *)(r1 + 0)
1: 63 21 0c 00 00 00 00 00 *(u32 *)(r1 + 12) = r2
2: b7 00 00 00 00 00 00 00 r0 = 0
3: 95 00 00 00 00 00 00 00 exit
and is loaded as:
; d->remote_ip6[2] = d->family;
0: (61) r2 = *(u32 *)(r1 +20)
; d->remote_ip6[2] = d->family;
1: (63) *(u32 *)(r1 +40) = r2
invalid bpf_context access off=40 size=4
I believe this error is because d->remote_ip6 is read-only, and that the
modification is more a product of creduce than anything else. Still, we
can see that the CO-RE adjusted offset of the array element from the
context pointer is correct: 32 to remote_ip6, plus 8 for the array
index, so the total offset is 40.
In this case, the statement is
d->remote_ip6[2] = d->family;
And the whole "d->remote_ip6[2]" is relocated. So we generate a single
instruction for it:
*(u32 *)(r1 +40) = ...
So the workaround is to keep all the related field accesses in one
statement, up to the load/store operation, so we have ONE complete
relocation.
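A minimal sketch of that workaround, using a read because remote_ip6 is read-only for sockops programs. Several things here are assumptions for illustration: the cut-down struct (a real program would take the definition from vmlinux.h), the fixed array length of 4, the CORE_STRUCT and PROG_SEC macros, and the function name. The #else branch exists only so the snippet also builds with a host compiler.

```c
#if defined(__bpf__)
/* BPF build: record field accesses as CO-RE relocations and place the
 * program in the "sockops" ELF section, as in the reproducer. */
#define CORE_STRUCT __attribute__((preserve_access_index))
#define PROG_SEC(name) __attribute__((section(name), used))
#else
/* Host build (illustration only): no relocations, no special section. */
#define CORE_STRUCT
#define PROG_SEC(name)
#endif

/* Assumed cut-down context struct, modelled on the reduced test case. */
struct bpf_sock_ops {
    int family;
    int remote_ip6[4];
} CORE_STRUCT;

/* Pattern to avoid: the array base is first materialised as ctx + offset
 * in a register, then indexed separately without a relocation:
 *
 *     int *c = d->remote_ip6;
 *     int word = c[2];
 */

/* Preferred pattern: the full chain d->remote_ip6[2] sits in one
 * expression, so the compiler can cover it with one relocation and emit
 * one load relative to the unmodified ctx pointer. */
PROG_SEC("sockops") int check_ip6_word(struct bpf_sock_ops *d)
{
    int word = d->remote_ip6[2];

    return word != 0;
}
```

Whether the compiler really emits a single relocated load should still be confirmed with llvm-objdump for the clang version in use.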
Also note that removal of __attribute__((preserve_access_index)) from
the first (miscompiled) program produces exactly the same bytecode as
this new program (with no locals).
What is going on here? Why does the access of an array in context in
this particular way cause it to generate code that would not pass the
verifier? Is it a bug in Clang/LLVM, or is it the verifier being too
strict?
How can we fix this issue? We could generate IR with relocation
information for standalone array operations, so that LLVM can later
chain them together. I will take a closer look at a fix later.
Thanks
YiFei Zhu