On Thu, 2023-07-27 at 20:03 -0700, Yonghong Song wrote:
>
> On 7/26/23 4:39 PM, Eduard Zingerman wrote:
> > On Wed, 2023-07-26 at 23:03 +0300, Eduard Zingerman wrote:
> > [...]
> > > > > It looks like `PT_REGS_IP_CORE` macro should not be defined through
> > > > > bpf_probe_read_kernel(). I'll dig through commit history tomorrow to
> > > > > understand why it is defined like that now.
> > > >
> > > > If I recall the rationale was to allow the macros to work for both
> > > > BPF programs that can do direct dereference (fentry, fexit, tp_btf etc)
> > > > and for kprobe-style that need to use bpf_probe_read_kernel().
> > > > Not sure if it would be worth having variants that are purely
> > > > dereference-based, since we can just use PT_REGS_IP() due to
> > > > the __builtin_preserve_access_index attributes applied in vmlinux.h.
> > >
> > > Sorry, need a bit more time, thanks for the context.
> >
> > The PT_REGS_*_CORE macros were added by Andrii Nakryiko in [1].
> > Stated intent there is to use those macros for raw tracepoint
> > programs. Such programs have `struct pt_regs` as a parameter.
> > Contexts of type `struct pt_regs` are *not* subject to rewrite by
> > convert_ctx_access(), so it is valid to use PT_REGS_*_CORE for such
> > programs.
> >
> > However, `struct pt_regs` is also a part of `struct
> > bpf_perf_event_data`. The latter is used as a context parameter for
> > "perf_event" programs and is subject to rewrite by
> > convert_ctx_access(). Thus, PT_REGS_*_CORE macros can't be used for
> > such programs (because these macros are implemented through
> > bpf_probe_read_kernel(), of which convert_ctx_access() is not aware).
> >
> > If `struct pt_regs` is defined with `preserve_access_index` attribute
> > CO-RE relocations are generated for both PT_REGS_IP_CORE and
> > PT_REGS_IP invocations. So, there is no real need to use *_CORE
> > variants in combination with `struct bpf_perf_event_data` to have all
> > CO-RE benefits, e.g.:
> >
> >   $ cat bpf.c
> >   #include "vmlinux.h"
> >   // ...
> >   SEC("perf_event")
> >   int do_test(struct bpf_perf_event_data *ctx) {
> >     return PT_REGS_IP(&ctx->regs);
> >   }
> >   // ...
> >   $ llvm-objdump --no-show-raw-insn -rd bpf.o
> >   ...
> >   0000000000000000 <do_test>:
> >          0:   r0 = *(u64 *)(r1 + 0x80)
> >               0000000000000000:  CO-RE <byte_off> [11] struct bpf_perf_event_data::regs.ip (0:0:16)
> >          1:   exit
> >
> > [1] b8ebce86ffe6 ("libbpf: Provide CO-RE variants of PT_REGS macros")
> >
> > ---
> >
> > I think the following should be done:
> > - Timofei's code should use PT_REGS_IP and make sure that `struct
> >   pt_regs` has preserve_access_index annotation (e.g. use vmlinux.h);
> > - verifier should be adjusted to report error when
> >   bpf_probe_read_kernel() (and similar) are used to read from "fake"
> >   contexts.
>
> The func prototype of bpf_probe_read_kernel() is
>
> BPF_CALL_3(bpf_probe_read_kernel, void *, dst, u32, size,
>            const void *, unsafe_ptr)
> {
>         return bpf_probe_read_kernel_common(dst, size, unsafe_ptr);
> }
>
> Notice the argument name is 'unsafe_ptr'. So there is no checking
> in verifier for this argument. Some users may take advantage of this
> to initialize the 'dst' with 0 by providing an illegal address.

On the one hand yes, but on the other hand the address of a context
parameter like bpf_perf_event_data is kind of fake: it does not
exist. It would be meaningful to use bpf_probe_read_kernel() for this
address only if someone knows the layout of the internal verifier
structure `bpf_perf_event_data_kern` and wants to access it.
Tbh, this appears to be a "footgun".

>
> >
> > - (maybe?) update PT_REGS_*_CORE to use `__builtin_preserve_access_index`
> >   (to allow usage with `bpf_perf_event_data` context).
> >
> > [...]
> >
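
To make the "fake context" point concrete, here is a rough user-space
model in plain C. It is not BPF or kernel code. All `model_*` names are
made up for illustration, the structs are heavily trimmed, and the
kernel-side layout is only loosely modelled on `bpf_perf_event_data_kern`
(the assumption being that `regs` is reached through a pointer there).
It shows why a direct load that gets rewritten finds the right field,
while a helper-style byte copy from the same address, using the
program-visible offset, does not.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed stand-in for struct pt_regs. */
struct model_pt_regs {
    uint64_t sp;
    uint64_t ip;
};

/* Layout the program is written against: regs embedded by value
 * (in the spirit of struct bpf_perf_event_data). */
struct model_ctx_user {
    struct model_pt_regs regs;
    uint64_t sample_period;
};

/* Layout actually behind the context pointer in this model: regs is
 * reached through a pointer; slots[] is filler for other members. */
struct model_ctx_kern {
    struct model_pt_regs *regs;
    uint64_t slots[2];
};

/* Direct load after a convert_ctx_access()-style rewrite:
 * follow the regs pointer first, then read the field. */
static uint64_t read_ip_rewritten(const struct model_ctx_kern *kctx)
{
    return kctx->regs->ip;
}

/* Helper-style read: copy bytes from ctx + program-visible offset,
 * with no rewrite applied to the address. */
static uint64_t read_ip_via_helper(const void *ctx)
{
    const size_t off = offsetof(struct model_ctx_user, regs) +
                       offsetof(struct model_pt_regs, ip);
    uint64_t ip = 0;

    memcpy(&ip, (const char *)ctx + off, sizeof(ip));
    return ip;
}
```

In this model, pointing both readers at the same `struct model_ctx_kern`
gives the real `ip` from read_ip_rewritten(), while read_ip_via_helper()
returns whatever filler bytes sit at that offset. That is the mismatch
described above for PT_REGS_*_CORE on a perf_event context.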