Re: Question: CO-RE-enabled PT_REGS macros give strange results

Timofei Pushkin <pushkin.td@xxxxxxxxx> · Mon, 24 Jul 2023 18:04:43 +0300

On Mon, Jul 24, 2023 at 3:36 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On 24/07/2023 11:32, Timofei Pushkin wrote:
> > Dear BPF community,
> >
> > I'm developing a perf_event BPF program which reads some register
> > values (frame and instruction pointers in particular) from the context
> > provided to it. I found that CO-RE-enabled PT_REGS macros give results
> > different from the results of the usual PT_REGS macros. I run the
> > program on the same system I compiled it on, and so I cannot
> > understand why the results differ or which ones I should use.
> >
> > From my tests, the results of the usual macros are the correct ones
> > (e.g. I can symbolize the instruction pointers I get this way), but
> > since I try to follow the CO-RE principle, it seems like I should be
> > using the CO-RE-enabled variants instead.
> >
> > I did some experiments and found out that it is the
> > bpf_probe_read_kernel part of the CO-RE-enabled PT_REGS macros that
> > changes the results and not __builtin_preserve_access_index. But I
> > still don't get why exactly it changes the results.
> >
>
> Can you provide the exact usage of the BPF CO-RE macros that isn't
> working, and the equivalent non-CO-RE version that is? Also if you

As a minimal example, I wrote the following little BPF program which
prints instruction pointers obtained with non-CO-RE and CO-RE macros:

volatile const pid_t target_pid;

SEC("perf_event")
int do_test(struct bpf_perf_event_data *ctx) {
    pid_t pid = bpf_get_current_pid_tgid();
    if (pid != target_pid) return 0;

    unsigned long p = PT_REGS_IP(&ctx->regs);
    unsigned long p_core = PT_REGS_IP_CORE(&ctx->regs);
    bpf_printk("non-CO-RE: %lx, CO-RE: %lx", p, p_core);

    return 0;
}
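
A side note on the PID check above, as a minimal user-space sketch: the
bpf_get_current_pid_tgid() helper returns a 64-bit value with the TGID
(the process ID as seen from user space) in the upper 32 bits and the
kernel-level PID (the thread ID) in the lower 32 bits, so assigning it
to a pid_t keeps only the thread ID. The function names below are made
up for illustration and are not part of libbpf.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a value with the same layout that
 * bpf_get_current_pid_tgid() returns (TGID << 32 | thread ID). */
static inline uint64_t make_pid_tgid(uint32_t tgid, uint32_t tid)
{
    return ((uint64_t)tgid << 32) | tid;
}

/* Upper 32 bits: the TGID, i.e. the PID shown by user-space tools. */
static inline uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: the thread ID. This is what a plain narrowing
 * assignment to a 32-bit pid_t keeps. */
static inline uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}

/* Illustrative self-check with made-up IDs. */
static inline void pid_tgid_selfcheck(void)
{
    uint64_t v = make_pid_tgid(1234, 1240);

    assert(tgid_of(v) == 1234);
    assert(tid_of(v) == 1240);
}
```

For a single-threaded target such as the test program used below, the
thread ID equals the TGID, so the comparison with target_pid behaves as
intended.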

From user space, I set the target PID and attach the program to CPU
clock perf events (error checking and cleanup omitted for brevity):

int main(int argc, char *argv[]) {
    // Load the program, also setting the target PID
    struct test_program_bpf *skel = test_program_bpf__open();
    skel->rodata->target_pid = (pid_t) strtol(argv[1], NULL, 10);
    test_program_bpf__load(skel);

    // Attach to perf events
    struct perf_event_attr attr = {
        .type = PERF_TYPE_SOFTWARE,
        .size = sizeof(struct perf_event_attr),
        .config = PERF_COUNT_SW_CPU_CLOCK,
        .sample_freq = 1,
        .freq = true
    };
    for (int cpu_i = 0; cpu_i < libbpf_num_possible_cpus(); cpu_i++) {
        int perf_fd = syscall(SYS_perf_event_open, &attr, -1, cpu_i, -1, 0);
        bpf_program__attach_perf_event(skel->progs.do_test, perf_fd);
    }

    // Wait for Ctrl-C
    pause();
    return 0;
}

As an experiment, I launched a simple C program with an endless loop
in main and started the BPF program above with its target PID set to
the PID of this simple C program. By checking the virtual memory
mappings of the C program (with "cat /proc/<PID>/maps"), I found that
its .text section was mapped at the address range
55ca2577b000-55ca2577c000. The output of the BPF program was
"non-CO-RE: 55ca2577b131, CO-RE: ffffa58810527e48". As you can see,
the non-CO-RE result falls within the .text mapping of the launched C
program (as it should, since this is the value of the instruction
pointer), while the CO-RE result does not.
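
The range check described above can be sketched in C roughly as
follows. This is only an illustration of the manual check, not the
code I actually ran; the function names are made up, and only the
leading "start-end" part of a /proc/<PID>/maps line is parsed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Parse the leading "start-end" hex range of one /proc/<PID>/maps
 * line. Returns true only if both bounds were read. */
static inline bool parse_maps_range(const char *line,
                                    uint64_t *start, uint64_t *end)
{
    assert(line != NULL);
    return sscanf(line, "%" SCNx64 "-%" SCNx64, start, end) == 2;
}

/* Ranges in maps are half-open: start is included, end is not. */
static inline bool addr_in_range(uint64_t addr,
                                 uint64_t start, uint64_t end)
{
    return addr >= start && addr < end;
}

/* Hypothetical convenience wrapper: does addr fall inside the mapping
 * described by this maps line? Unparsable lines count as "no". */
static inline bool addr_in_maps_line(const char *line, uint64_t addr)
{
    uint64_t start, end;

    return parse_maps_range(line, &start, &end) &&
           addr_in_range(addr, start, end);
}
```

With a line starting with "55ca2577b000-55ca2577c000", the check
succeeds for 0x55ca2577b131 and fails for 0xffffa58810527e48, which
matches what I observed by hand.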

Alternatively, if I replace PT_REGS_IP and PT_REGS_IP_CORE with the
equivalents for the stack pointer (PT_REGS_SP and PT_REGS_SP_CORE), I
get results that correspond to the stack address space from the
non-CO-RE macro, but I always get 0 from the CO-RE macro.

> can provide details on the platform you're running on that will
> help narrow down the issue. Thanks!

Sure. I'm running Ubuntu 22.04.1, kernel version 5.19.0-46-generic,
the architecture is x86_64, clang 14.0.0 is used to compile BPF
programs with flags -g -O2 -D__TARGET_ARCH_x86.

Thanks,
Timofei

>
> Alan
>
> > Thank you in advance,
> > Timofei
> >