On 08/12/2023 14:21, Eduard Zingerman wrote: > On Fri, 2023-12-08 at 12:27 +0000, Alan Maguire wrote: > [...] >> Sorry if this is a digression, but I'm trying to understand how >> this might intersect with vmlinux.h's >> >> #pragma clang attribute push (__attribute__((preserve_access_index)), >> apply_to = record >> >> Since that is currently applied to all structures in vmlinux.h, does >> that protect us from the above scenario when BPF code is compiled and >> #include's vmlinux.h (I suspect not from what you say below but just >> wanted to check)? I realize we get extra relocation info that we don't >> need since the offsets for these BPF context structures are recalcuated >> by the verifier, but given that clang needs to record the relocations, >> does it also constrain the generated code to avoid these "increment >> pointer, use zero offset" instruction patterns? Or can they still occur >> with preserve_access_index applied to the structure? Sorry, might be a >> naive question but it's not clear to me how (if at all) the mechanisms >> might interact. > > Unfortunately preserve_access_index does not save us from this problem. > This is the case because field reads and writes are split as two LLVM > IR instructions: getelementptr to get an address, and load/store > to/from that address. The preserve_access_index transformation > rewrites the getelementptr but does not touch load/store. > > For example, consider the following C code: > > /* #define __ctx __attribute__((preserve_static_offset)) */ > /* #define __pai */ > #define __ctx > #define __pai __attribute__((preserve_access_index)) > > extern int magic2(int); > > struct bpf_sock { > int bound_dev_if; > int family; > } __ctx __pai; > > struct bpf_sockopt { > int _; > struct bpf_sock *sk; > int level; > int optlen; > } __ctx __pai; > > int known_load_sink_example_1(struct bpf_sockopt *ctx) > { > unsigned g = 0; > switch (ctx->level) { > case 10: > g = magic2(ctx->sk->family); > break; > case 20: > g = magic2(ctx->optlen); > break; > } > return g % 2; > } > > Here is how it is compiled: > > $ clang -g -O2 --target=bpf -mcpu=v3 -c e3.c -o - | llvm-objdump --no-show-raw-insn -Sdr - > ... > 0000000000000000 <known_load_sink_example_1>: > ; switch (ctx->level) { > 0: r2 = *(u32 *)(r1 + 0x10) > 0000000000000000: CO-RE <byte_off> [2] struct bpf_sockopt::level (0:2) > 1: if w2 == 0x14 goto +0x5 <LBB0_3> > 2: w0 = 0x0 > ; switch (ctx->level) { > 3: if w2 != 0xa goto +0x8 <LBB0_5> > ; g = magic2(ctx->sk->family); > 4: r1 = *(u64 *)(r1 + 0x8) > 0000000000000020: CO-RE <byte_off> [2] struct bpf_sockopt::sk (0:1) > 5: r2 = 0x4 > 0000000000000028: CO-RE <byte_off> [7] struct bpf_sock::family (0:1) > 6: goto +0x1 <LBB0_4> > > 0000000000000038 <LBB0_3>: > 7: r2 = 0x14 > 0000000000000038: CO-RE <byte_off> [2] struct bpf_sockopt::optlen (0:3) > > 0000000000000040 <LBB0_4>: > 8: r1 += r2 > 9: r1 = *(u32 *)(r1 + 0x0) <---------------- verifier error would > 10: call -0x1 be reported for this insn > 0000000000000050: R_BPF_64_32 magic2 > ; return g % 2; > 11: w0 &= 0x1 > > 0000000000000060 <LBB0_5>: > 12: exit > >> The reason I ask is if it was safe to assume that code generation would >> avoid such patterns with preserve_access_index, it might avoid needing >> to update vmlinux.h generation. > > In current LLVM implementation preserve_static_offset has priority > over preserve_access_index. So changing __pai/__ctx definitions above helps. > (And this priority of one attribute over the other was the reason to > have preserve_static_offset as an attribute, not as > btf_decl_tag("preserve_static_offset"). Although that is unfortunate > for vmlinux.h, as we already have means to preserve decl tags). > Thanks for the explanation! > [...] > >>> How to add the same definitions in vmlinux.h is an open question, >>> and most likely requires bpftool modification: >>> - Hard code generation of __bpf_ctx based on type names? >>> - Mark context types with some special >>> __attribute__((btf_decl_tag("preserve_static_offset"))) >>> and convert it to __attribute__((preserve_static_offset))? >> >> To me it seems like whatever mechanism supports identification of such >> structures would need to live in vmlinux BTF as ideally it should be >> possible to generate vmlinux.h purely from that BTF. That seems to argue >> for the declaration tag approach. > > Tbh, I like the decl tag approach a bit more too. > Although macro definition would be somewhat ridiculous: > > #if __has_attribute(preserve_static_offset) && defined(__bpf__) > #define __bpf_ctx __attribute__((preserve_static_offset)) \ > __attribute__((btf_decl_tag("preserve_static_offset"))) > #else > #define __bpf_ctx > #endif > As macro definitions go, that's not that ridiculous ;-) If we add it to vmlinux.h, would be good to have a #ifdef BPF_NO_PRESERVE_STATIC_OFFSET #undef __bpf_ctx #define __bpf_ctx #endif ...too, just in case the user wanted to use CO-RE with any of the types covered. Thanks! Alan