Re: Question: CO-RE-enabled PT_REGS macros give strange results

On 7/26/23 4:39 PM, Eduard Zingerman wrote:
On Wed, 2023-07-26 at 23:03 +0300, Eduard Zingerman wrote:
[...]
It looks like `PT_REGS_IP_CORE` macro should not be defined through
bpf_probe_read_kernel(). I'll dig through commit history tomorrow to
understand why it is defined like that now.

If I recall the rationale was to allow the macros to work for both
BPF programs that can do direct dereference (fentry, fexit, tp_btf etc)
and for kprobe-style that need to use bpf_probe_read_kernel().
Not sure if it would be worth having variants that are purely
dereference-based, since we can just use PT_REGS_IP() due to
the __builtin_preserve_access_index attributes applied in vmlinux.h.

Sorry, need a bit more time, thanks for the context.

The PT_REGS_*_CORE macros were added by Andrii Nakryiko in [1].
Stated intent there is to use those macros for raw tracepoint
programs. Such programs have `struct pt_regs` as a parameter.
Contexts of type `struct pt_regs` are *not* subject to rewrite by
convert_ctx_access(), so it is valid to use PT_REGS_*_CORE for such
programs.

However, `struct pt_regs` is also a part of `struct
bpf_perf_event_data`. The latter is used as a context parameter for
"perf_event" programs and is subject to rewrite by
convert_ctx_access(). Thus, PT_REGS_*_CORE macros can't be used for
such programs (because these macros are implemented through
bpf_probe_read_kernel(), of which convert_ctx_access() is not aware).
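
To make the mismatch concrete, here is a minimal user-space sketch of
the two access paths. It is a toy model, not kernel or libbpf code:
all toy_* names and layouts are hypothetical, the structs are cut down
to two fields, and a 64-bit build is assumed. It only mimics the idea
that a direct context load is redirected through the real regs
pointer, while a probe-read style copy takes the program-visible
offset at face value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for struct pt_regs (hypothetical, two fields only). */
struct toy_pt_regs {
	uint64_t sp;
	uint64_t ip;
};

/* Layout the program is compiled against: regs embedded in the
 * context, in the spirit of struct bpf_perf_event_data. */
struct toy_uapi_ctx {
	struct toy_pt_regs regs;
	uint64_t sample_period;
};

/* Layout actually handed over at run time: only a pointer to the
 * regs, in the spirit of struct bpf_perf_event_data_kern. */
struct toy_kern_ctx {
	struct toy_pt_regs *regs;
	uint64_t other;
};

/* Offset of regs.ip as seen through the program-visible layout. */
#define TOY_IP_OFF \
	(offsetof(struct toy_uapi_ctx, regs) + offsetof(struct toy_pt_regs, ip))

/* Keep the raw read below inside the toy kernel context. */
_Static_assert(sizeof(struct toy_kern_ctx) >= TOY_IP_OFF + sizeof(uint64_t),
	       "toy model assumes a 64-bit build");

/* Models a direct context load after the rewrite: follow the real
 * regs pointer first, then apply the offset within the regs. */
uint64_t toy_rewritten_load(const struct toy_kern_ctx *kctx, size_t uapi_off)
{
	size_t regs_off = uapi_off - offsetof(struct toy_uapi_ctx, regs);
	uint64_t val;

	assert(regs_off + sizeof(val) <= sizeof(struct toy_pt_regs));
	memcpy(&val, (const char *)kctx->regs + regs_off, sizeof(val));
	return val;
}

/* Models a probe-read style access: copy raw bytes at ctx + offset,
 * with no knowledge of the real layout behind the pointer. */
uint64_t toy_probe_read(const void *ctx, size_t uapi_off)
{
	uint64_t val;

	memcpy(&val, (const char *)ctx + uapi_off, sizeof(val));
	return val;
}
```

With regs.ip set to 0x2000 and the unrelated `other` field set to 42,
toy_rewritten_load(&kctx, TOY_IP_OFF) returns 0x2000, while
toy_probe_read(&kctx, TOY_IP_OFF) returns 42: whatever happens to sit
at that offset in the real context.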

If `struct pt_regs` is defined with the `preserve_access_index`
attribute, CO-RE relocations are generated for both PT_REGS_IP_CORE
and PT_REGS_IP invocations. So, there is no real need to use the
*_CORE variants in combination with `struct bpf_perf_event_data` to
get all the CO-RE benefits, e.g.:

   $ cat bpf.c
   #include "vmlinux.h"
   // ...
   SEC("perf_event")
   int do_test(struct bpf_perf_event_data *ctx) {
     return PT_REGS_IP(&ctx->regs);
   }
   // ...
   $ llvm-objdump --no-show-raw-insn -rd bpf.o
   ...
   0000000000000000 <do_test>:
          0: r0 = *(u64 *)(r1 + 0x80)
             0000000000000000:  CO-RE <byte_off> [11] struct bpf_perf_event_data::regs.ip (0:0:16)
          1: exit

[1] b8ebce86ffe6 ("libbpf: Provide CO-RE variants of PT_REGS macros")

---

I think the following should be done:
- Timofei's code should use PT_REGS_IP and make sure that `struct
   pt_regs` has the preserve_access_index annotation (e.g. use vmlinux.h);
- the verifier should be adjusted to report an error when
   bpf_probe_read_kernel() (and similar helpers) are used to read from
   "fake" contexts.

The definition of bpf_probe_read_kernel() is:

BPF_CALL_3(bpf_probe_read_kernel, void *, dst, u32, size,
           const void *, unsafe_ptr)
{
        return bpf_probe_read_kernel_common(dst, size, unsafe_ptr);
}

Notice that the argument is named 'unsafe_ptr', so the verifier does
no checking on this argument. Some users may take advantage of this
to initialize 'dst' with 0 by providing an illegal address.


- (maybe?) update PT_REGS_*_CORE to use `__builtin_preserve_access_index`
   (to allow usage with `bpf_perf_event_data` context).

[...]




