Re: [PATCH bpf-next 0/5] Attach a cookie to a tracing program.

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Mon, 31 Jan 2022 22:45:53 -0800

On Wed, Jan 26, 2022 at 9:17 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Jan 26, 2022 at 1:48 PM Kui-Feng Lee <kuifeng@xxxxxx> wrote:
> >
> > Allow users to attach a 64-bits cookie to a BPF program when link it
> > to fentry, fexit, or fmod_ret of a function.
> >
> > This changeset includes several major changes.
> >
> >  - Add a new field bpf_cookie to struct raw_tracepoint, so that a user
> >    can attach a cookie to a program.
> >
> >  - Store flags in trampoline frames to provide the flexibility of
> >    storing more values in these frames.
> >
> >  - Store the program ID of the current BPF program in the trampoline
> >    frame.
> >
> >  - The implmentation of bpf_get_attach_cookie() for tracing programs
> >    to read the attached cookie.
>
> flags, prog_id, cookie... I don't follow what's going on here.
>
> cookie is supposed to be per link.
> Doing it for fentry only will be inconvenient for users.
> For existing kprobes there is no good place to store it. iirc.

Hm... are you talking about current kprobes? We already have
bpf_cookie support for them. We store it in the associated perf_event,
it's actually pretty convenient.

> For multi attach kprobes there won't be a good place either.

As Jiri mentioned, for multi-attach kprobes the idea was to keep a
sorted array of ips and associated cookies to do log(N) search in
bpf_get_attach_cookie() helper.

For multi-attach fentry, we could use the same approach if we let
either bpf_link or bpf_prog available to fentry/fexit program at
runtime. Kui-Feng seems to be storing prog_id in BPF trampoline
generated code, I don't think that's the best approach, it would be
probably better to store bpf_link or bpf_prog pointer, whichever is
more convenient. I think we can't attach the same BPF program twice to
the same BPF trampoline, so storing this array of ip -> cookie
mappings in bpf_prog would work (because the mapping is unique), but
might be a more cumbersome. Storing bpf_link is conceptually better,
probably, but I haven't thought through if there are any problems if
we support updating bpf_link's underlying program.

But keep in mind, this patch set is basically an RFC to arrive at the
best approach that will also be compatible with multi-attach
fentry/fexit. Though it seems like we'll first need to establish the
need for bpf_cookie (I thought that was not controversial, my bad),
which is fine, see below.

> I think cookie should be out of band.
> Maybe lets try a resizable map[ip]->cookie and don't add
> 'cookie' arrays to multi-kprobe attach,
> 'cookie' field to kprobe, fentry, and other attach apis.
> Adding 'cookie' to all of them is quite a bit of churn for kernel
> and user space.

We don't need all BPF program types, but anything that's useful for
generic tracing is much more powerful with cookies. We have them for
kprobe, uprobe and perf_event programs already. For multi-attach
kprobe/kretprobe and fentry/fexit they are essentially a must to let
users use those BPF program types to their fullest.

> I think resizable bpf map[u64]->u64 solves this problem.
> Maybe cookie isn't even needed.
> If the bpf prog can have a clean map[bpf_get_func_ip()] that
> doesn't have to be sized up front it will address the need.

You mean for users to maintain their own BPF map and pre-populated it
before attaching their BPF programs? Or did I misunderstand?

Sizing such a BPF map is just one problem.

Figuring out the correct IP to use as a key is another and arguably
bigger hassle. Everyone will be forced to consult kallsyms, even if
their use case is simple and they don't need to parse kallsyms at all.
Normally, for fentry/fexit programs, users don't even need kallsyms,
as vmlinux BTF is used to find BTF ID and that's enough to load and
attach fentry/fexit.

And while for kprobe/fentry programs it's inconvenient but can be done
with a bunch of extra work, for uprobes there are situations where
it's impossible to know IP at all. One very specific example is USDTs
in shared library. If user needs to attach to USDT defined in a shared
library across *any* PID (current and future), it's impossible to do
without BPF cookie support, because each different process will load
shared library at unknown base address, so it's impossible to
calculate upfront any absolute IP to look up by. This is the reason
BCC allows to attach to USDTs in shared library only for one specific
process (i.e., PID has to be specified in such a case).

Or did you propose to maintain such IP -> cookie mapping inside the
kernel during bpf_link creation? I think the key would have to be at
least a (link ID, IP) pair, no? The question is whether it's going to
be more convenient to get link ID and IP for all supported BPF
programs at runtime and whether it will cause the same or more amount
of churn?