On Fri, 2023-12-08 at 12:27 +0000, Alan Maguire wrote: [...] > Sorry if this is a digression, but I'm trying to understand how > this might intersect with vmlinux.h's > > #pragma clang attribute push (__attribute__((preserve_access_index)), > apply_to = record > > Since that is currently applied to all structures in vmlinux.h, does > that protect us from the above scenario when BPF code is compiled and > #include's vmlinux.h (I suspect not from what you say below but just > wanted to check)? I realize we get extra relocation info that we don't > need since the offsets for these BPF context structures are recalcuated > by the verifier, but given that clang needs to record the relocations, > does it also constrain the generated code to avoid these "increment > pointer, use zero offset" instruction patterns? Or can they still occur > with preserve_access_index applied to the structure? Sorry, might be a > naive question but it's not clear to me how (if at all) the mechanisms > might interact. Unfortunately preserve_access_index does not save us from this problem. This is the case because field reads and writes are split as two LLVM IR instructions: getelementptr to get an address, and load/store to/from that address. The preserve_access_index transformation rewrites the getelementptr but does not touch load/store. For example, consider the following C code: /* #define __ctx __attribute__((preserve_static_offset)) */ /* #define __pai */ #define __ctx #define __pai __attribute__((preserve_access_index)) extern int magic2(int); struct bpf_sock { int bound_dev_if; int family; } __ctx __pai; struct bpf_sockopt { int _; struct bpf_sock *sk; int level; int optlen; } __ctx __pai; int known_load_sink_example_1(struct bpf_sockopt *ctx) { unsigned g = 0; switch (ctx->level) { case 10: g = magic2(ctx->sk->family); break; case 20: g = magic2(ctx->optlen); break; } return g % 2; } Here is how it is compiled: $ clang -g -O2 --target=bpf -mcpu=v3 -c e3.c -o - | llvm-objdump --no-show-raw-insn -Sdr - ... 0000000000000000 <known_load_sink_example_1>: ; switch (ctx->level) { 0: r2 = *(u32 *)(r1 + 0x10) 0000000000000000: CO-RE <byte_off> [2] struct bpf_sockopt::level (0:2) 1: if w2 == 0x14 goto +0x5 <LBB0_3> 2: w0 = 0x0 ; switch (ctx->level) { 3: if w2 != 0xa goto +0x8 <LBB0_5> ; g = magic2(ctx->sk->family); 4: r1 = *(u64 *)(r1 + 0x8) 0000000000000020: CO-RE <byte_off> [2] struct bpf_sockopt::sk (0:1) 5: r2 = 0x4 0000000000000028: CO-RE <byte_off> [7] struct bpf_sock::family (0:1) 6: goto +0x1 <LBB0_4> 0000000000000038 <LBB0_3>: 7: r2 = 0x14 0000000000000038: CO-RE <byte_off> [2] struct bpf_sockopt::optlen (0:3) 0000000000000040 <LBB0_4>: 8: r1 += r2 9: r1 = *(u32 *)(r1 + 0x0) <---------------- verifier error would 10: call -0x1 be reported for this insn 0000000000000050: R_BPF_64_32 magic2 ; return g % 2; 11: w0 &= 0x1 0000000000000060 <LBB0_5>: 12: exit > The reason I ask is if it was safe to assume that code generation would > avoid such patterns with preserve_access_index, it might avoid needing > to update vmlinux.h generation. In current LLVM implementation preserve_static_offset has priority over preserve_access_index. So changing __pai/__ctx definitions above helps. (And this priority of one attribute over the other was the reason to have preserve_static_offset as an attribute, not as btf_decl_tag("preserve_static_offset"). Although that is unfortunate for vmlinux.h, as we already have means to preserve decl tags). [...] > > How to add the same definitions in vmlinux.h is an open question, > > and most likely requires bpftool modification: > > - Hard code generation of __bpf_ctx based on type names? > > - Mark context types with some special > > __attribute__((btf_decl_tag("preserve_static_offset"))) > > and convert it to __attribute__((preserve_static_offset))? > > To me it seems like whatever mechanism supports identification of such > structures would need to live in vmlinux BTF as ideally it should be > possible to generate vmlinux.h purely from that BTF. That seems to argue > for the declaration tag approach. Tbh, I like the decl tag approach a bit more too. Although macro definition would be somewhat ridiculous: #if __has_attribute(preserve_static_offset) && defined(__bpf__) #define __bpf_ctx __attribute__((preserve_static_offset)) \ __attribute__((btf_decl_tag("preserve_static_offset"))) #else #define __bpf_ctx #endif Thanks, Eduard