Re: [PATCH v2 bpf-next][RFC] libbpf: introduce legacy kprobe events support

Rafael David Tinoco <rafaeldtinoco@xxxxxxxxxx> · Wed, 14 Apr 2021 11:30:42 -0300

> 
>>> So I don't get at all why you have these toggles, especially
>>> ALL_TOGGLE? You shouldn't try to determine the state of another probe.
>>> You always know whether you want to enable or disable your specific
>>> toggle. I'm very confused by all this.
>> 
>> Yes, this was a confusing thing indeed and to be honest it proved to
>> be very buggy when testing with conntracker. What I’ll do (or I’m
>> doing) is to toggle ON to needed files before the probe is added:
>> 
>> static inline int add_kprobe_event_legacy(const char* func_name, bool
>> retprobe)
>> {
>>        int ret = 0;
>> 
>>        ret |= poke_kprobe_events(true, func_name, retprobe);
>>        ret |= toggle_kprobe_event_legacy_all(true);
>>        ret |= toggle_single_kprobe_event_legacy(true, func_name, retprobe);
>> 
>>        return ret;
>> }
>> 
>> 1) /sys/kernel/debug/tracing/kprobe_events => 1
>> 2) /sys/kernel/debug/tracing/events/kprobes/enable => 1
>> 3) /sys/kernel/debug/tracing/events/kprobes/%s/enable => 1
> 
> Ok, hold on. I don't think we should use those /enable files,
> actually. Double-checking what BCC does ([0]) and my local demo app I
> wrote a while ago, we use perf_event_open() to activate kprobe, once
> it is created, and that's all that is necessary.
> 
>  [0] https://github.com/iovisor/bcc/blob/master/src/cc/libbpf.c#L1046

No, they are not needed. Those are enabling ftrace kprobe feature:

trace_events.c:
    event_create_dir()
        trace_create_file("enable") -> 
            ftrace_enable_fops():
            .write = event_enable_write() -> ftrace_event_enable_disable()

And kprobe perf events works fine without playing with them as long as:
/sys/kernel/debug/tracing/kprobe_events is always 1 (should we enable
it by default or consider it is enabled and don’t change its value ?).

>> 
>> Because of /sys/kernel/debug/tracing/events/kprobes/%s/enable. I’m
>> toggling it to OFF before removing the kprobe in kprobe_events, like
>> showed above.
> 
> Alright, see above about enable files, it doesn't seem necessary,
> actually. You use poke_kprobe_events() to add or remove kprobe to the
> kernel. That gives you event_name and its id (from
> /sys/kernel/debug/tracing/events/kprobes/%s/id). You then use that id
> to create perf_event and activate BPF program:

Yes, with a small reservation I just found out: function names might
change because of GCC optimisations.. In my case I found out that:

# cat /proc/kallsyms | grep udp_send_skb
ffffffff8f9e0090 t udp_send_skb.isra.48

udp_send_skb probe was not always working because the function name
was changed. Then I saw BCC had this issue back in 2018 and is
fixing it now:

https://github.com/iovisor/bcc/issues/1754
https://github.com/iovisor/bcc/pull/2930

So I thought I could do the same: check if function name is the same
in /proc/kallsyms or if it has changed and use the changed name if
needed (to add to kprobe_events).

Will include that logic and remove the ‘enables’.

> 
> And that should be it. It doesn't seem like either BCC or my example
> (which I'm sure worked last time) does anything with /enable files and
> I'm sure all that works.

First comment.

> 
> [...]
> 
>>>>>     return bpf_program__attach_kprobe(prog, retprobe, func_name);
>>>>> }
>>>> 
>>>> I’m assuming this is okay based on your saying of detecting a feature
>>>> instead of using the if(x) if(y) approach.
>>>> 
>>>>> @@ -11280,4 +11629,7 @@ void bpf_object__destroy_skeleton(struct
>>>>> bpf_object_skeleton *s)
>>>>>      free(s->maps);
>>>>>      free(s->progs);(),
>>>>>      free(s);
>>>>> +
>>>>> +     remove_kprobe_event_legacy("ip_set_create", false);
>>>>> +     remove_kprobe_event_legacy("ip_set_create", true);
>>>> 
>>>> This is the main issue I wanted to show you before continuing.
>>>> I cannot remove the kprobe event unless the obj is unloaded.
>>>> That is why I have this hard coded here, just because I was
>>>> testing. Any thoughts how to cleanup the kprobes without
>>>> jeopardising the API too much ?
>>> 
>>> cannot as in it doesn't work for whatever reason? Or what do you mean?
>>> 
>>> I see that you had bpf_link__detach_perf_event_legacy calling
>>> remove_kprobe_event_legacy, what didn't work?
>>> 
>> 
>> I’m sorry for not being very clear here. What happens is that, if I
>> try to remove the kprobe_event_legacy() BEFORE:
>> 
>> if (s->progs)
>>        bpf_object__detach_skeleton(s);
>> if (s->obj)
>>        bpf_object__close(*s->obj);
>> 
>> It fails with generic write error on kprobe_events file. I need to
>> remove legacy kprobe AFTER object closure. To workaround this on
>> my project, and to show you this issue, I have come up with:
>> 
>> void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s)
>> {
>>         int i, j;
>>         struct probeleft {
>>                 char *probename;
>>                 bool retprobe;
>>         } probesleft[24];
>> 
>>         for (i = 0, j = 0; i < s->prog_cnt; i++) {
>>                 struct bpf_link **link = s->progs[i].link;
>>                 if ((*link)->legacy.name) {
>>                         memset(&probesleft[j], 0, sizeof(struct probeleft));
>>                         probesleft[j].probename = strdup((*link)->legacy.name);
>>                         probesleft[j].retprobe = (*link)->legacy.retprobe;
>>                         j++;
>>                 }
>>         }
>> 
>>         if (s->progs)
>>                 bpf_object__detach_skeleton(s);
>>         if (s->obj)
>>                 bpf_object__close(*s->obj);
>>         free(s->maps);
>>         free(s->progs);
>>         free(s);
>> 
>>         for (j--; j >= 0; j--) {
>>                 remove_kprobe_event_legacy(probesleft[j].probename, probesleft[j].retprobe);
>>                 free(probesleft[j].probename);
>>         }
>> }
>> 
>> Which, of course, is not what I’m suggesting to the lib, but shows
>> the problem and gives you a better idea on how to solve it not
>> breaking the API.
>> 
> 
> bpf_link__destroy() callback should handle that, no? You'll close perf
> event FD, which will "free up" kprobe and you can do
> poke_kprobe_events(false /*remove */, ...). Or am I still missing
> something?

I could only poke_kprobe_events() to remove the kprobe after
bpf_oject__close(), or I would get an I/O error on kprobe_events.
Not sure if after map destroy or program exit.

-rafaeldtinoco