On Wed, Apr 14, 2021 at 7:30 AM Rafael David Tinoco <rafaeldtinoco@xxxxxxxxxx> wrote: > > > > >>> So I don't get at all why you have these toggles, especially > >>> ALL_TOGGLE? You shouldn't try to determine the state of another probe. > >>> You always know whether you want to enable or disable your specific > >>> toggle. I'm very confused by all this. > >> > >> Yes, this was a confusing thing indeed and to be honest it proved to > >> be very buggy when testing with conntracker. What I’ll do (or I’m > >> doing) is to toggle ON to needed files before the probe is added: > >> > >> static inline int add_kprobe_event_legacy(const char* func_name, bool > >> retprobe) > >> { > >> int ret = 0; > >> > >> ret |= poke_kprobe_events(true, func_name, retprobe); > >> ret |= toggle_kprobe_event_legacy_all(true); > >> ret |= toggle_single_kprobe_event_legacy(true, func_name, retprobe); > >> > >> return ret; > >> } > >> > >> 1) /sys/kernel/debug/tracing/kprobe_events => 1 > >> 2) /sys/kernel/debug/tracing/events/kprobes/enable => 1 > >> 3) /sys/kernel/debug/tracing/events/kprobes/%s/enable => 1 > > > > Ok, hold on. I don't think we should use those /enable files, > > actually. Double-checking what BCC does ([0]) and my local demo app I > > wrote a while ago, we use perf_event_open() to activate kprobe, once > > it is created, and that's all that is necessary. > > > > [0] https://github.com/iovisor/bcc/blob/master/src/cc/libbpf.c#L1046 > > No, they are not needed. Those are enabling ftrace kprobe feature: > > trace_events.c: > event_create_dir() > trace_create_file("enable") -> > ftrace_enable_fops(): > .write = event_enable_write() -> ftrace_event_enable_disable() > > And kprobe perf events works fine without playing with them as long as: > /sys/kernel/debug/tracing/kprobe_events is always 1 (should we enable > it by default or consider it is enabled and don’t change its value ?). I think considering it enabled is the right call, given that's what BCC does. > > >> > >> Because of /sys/kernel/debug/tracing/events/kprobes/%s/enable. I’m > >> toggling it to OFF before removing the kprobe in kprobe_events, like > >> showed above. > > > > Alright, see above about enable files, it doesn't seem necessary, > > actually. You use poke_kprobe_events() to add or remove kprobe to the > > kernel. That gives you event_name and its id (from > > /sys/kernel/debug/tracing/events/kprobes/%s/id). You then use that id > > to create perf_event and activate BPF program: > > Yes, with a small reservation I just found out: function names might > change because of GCC optimisations.. In my case I found out that: > > # cat /proc/kallsyms | grep udp_send_skb > ffffffff8f9e0090 t udp_send_skb.isra.48 > > udp_send_skb probe was not always working because the function name > was changed. Then I saw BCC had this issue back in 2018 and is > fixing it now: > > https://github.com/iovisor/bcc/issues/1754 > https://github.com/iovisor/bcc/pull/2930 > > So I thought I could do the same: check if function name is the same > in /proc/kallsyms or if it has changed and use the changed name if > needed (to add to kprobe_events). > > Will include that logic and remove the ‘enables’. No, please stop adding arbitrary additions. Function renames, .isra optimizations, etc - that's all concerns of higher level, this API should not try to be smart. It should try to attach to exactly the kprobe specified. > > > > > And that should be it. It doesn't seem like either BCC or my example > > (which I'm sure worked last time) does anything with /enable files and > > I'm sure all that works. > > First comment. > > > > > [...] > > > >>>>> return bpf_program__attach_kprobe(prog, retprobe, func_name); > >>>>> } > >>>> > >>>> I’m assuming this is okay based on your saying of detecting a feature > >>>> instead of using the if(x) if(y) approach. > >>>> > >>>>> @@ -11280,4 +11629,7 @@ void bpf_object__destroy_skeleton(struct > >>>>> bpf_object_skeleton *s) > >>>>> free(s->maps); > >>>>> free(s->progs);(), > >>>>> free(s); > >>>>> + > >>>>> + remove_kprobe_event_legacy("ip_set_create", false); > >>>>> + remove_kprobe_event_legacy("ip_set_create", true); > >>>> > >>>> This is the main issue I wanted to show you before continuing. > >>>> I cannot remove the kprobe event unless the obj is unloaded. > >>>> That is why I have this hard coded here, just because I was > >>>> testing. Any thoughts how to cleanup the kprobes without > >>>> jeopardising the API too much ? > >>> > >>> cannot as in it doesn't work for whatever reason? Or what do you mean? > >>> > >>> I see that you had bpf_link__detach_perf_event_legacy calling > >>> remove_kprobe_event_legacy, what didn't work? > >>> > >> > >> I’m sorry for not being very clear here. What happens is that, if I > >> try to remove the kprobe_event_legacy() BEFORE: > >> > >> if (s->progs) > >> bpf_object__detach_skeleton(s); > >> if (s->obj) > >> bpf_object__close(*s->obj); > >> > >> It fails with generic write error on kprobe_events file. I need to > >> remove legacy kprobe AFTER object closure. To workaround this on > >> my project, and to show you this issue, I have come up with: > >> > >> void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s) > >> { > >> int i, j; > >> struct probeleft { > >> char *probename; > >> bool retprobe; > >> } probesleft[24]; > >> > >> for (i = 0, j = 0; i < s->prog_cnt; i++) { > >> struct bpf_link **link = s->progs[i].link; > >> if ((*link)->legacy.name) { > >> memset(&probesleft[j], 0, sizeof(struct probeleft)); > >> probesleft[j].probename = strdup((*link)->legacy.name); > >> probesleft[j].retprobe = (*link)->legacy.retprobe; > >> j++; > >> } > >> } > >> > >> if (s->progs) > >> bpf_object__detach_skeleton(s); > >> if (s->obj) > >> bpf_object__close(*s->obj); > >> free(s->maps); > >> free(s->progs); > >> free(s); > >> > >> for (j--; j >= 0; j--) { > >> remove_kprobe_event_legacy(probesleft[j].probename, probesleft[j].retprobe); > >> free(probesleft[j].probename); > >> } > >> } > >> > >> Which, of course, is not what I’m suggesting to the lib, but shows > >> the problem and gives you a better idea on how to solve it not > >> breaking the API. > >> > > > > bpf_link__destroy() callback should handle that, no? You'll close perf > > event FD, which will "free up" kprobe and you can do > > poke_kprobe_events(false /*remove */, ...). Or am I still missing > > something? > > I could only poke_kprobe_events() to remove the kprobe after > bpf_oject__close(), or I would get an I/O error on kprobe_events. > Not sure if after map destroy or program exit. Did you figure out why? What's causing an error? > > -rafaeldtinoco >