On Fri, Jun 02, 2023 at 10:27:31AM -0700, Andrii Nakryiko wrote: > On Thu, May 25, 2023 at 6:38 PM Jackie Liu <liu.yun@xxxxxxxxx> wrote: > > > > Hi Andrii. > > > > 在 2023/5/26 04:43, Andrii Nakryiko 写道: > > > On Thu, May 25, 2023 at 3:28 AM Jackie Liu <liu.yun@xxxxxxxxx> wrote: > > >> > > >> From: Jackie Liu <liuyun01@xxxxxxxxxx> > > >> > > >> When using regular expression matching with "kprobe multi", it scans all > > >> the functions under "/proc/kallsyms" that can be matched. However, not all > > >> of them can be traced by kprobe.multi. If any one of the functions fails > > >> to be traced, it will result in the failure of all functions. The best > > >> approach is to filter out the functions that cannot be traced to ensure > > >> proper tracking of the functions. > > >> > > >> Use available_filter_functions check first, if failed, fallback to > > >> kallsyms. > > >> > > >> Here is the test eBPF program [1]. > > >> [1] https://github.com/JackieLiu1/ketones/commit/a9e76d1ba57390e533b8b3eadde97f7a4535e867 > > >> > > >> Suggested-by: Jiri Olsa <olsajiri@xxxxxxxxx> > > >> Signed-off-by: Jackie Liu <liuyun01@xxxxxxxxxx> > > >> --- > > >> tools/lib/bpf/libbpf.c | 92 +++++++++++++++++++++++++++++++++++++----- > > >> 1 file changed, 83 insertions(+), 9 deletions(-) > > >> > > > > > > Question to you and Jiri: what happens when multi-kprobe's syms has > > > duplicates? Will the program be attached multiple times? If yes, then > > > it sounds like a problem? Both available_filters and kallsyms can have > > > duplicate function names in them, right? > > > > If I understand correctly, there should be no problem with repeated > > function registration, because the bottom layer is done through fprobe > > registration addrs, kprobe.multi itself does not do this work, but > > fprobe is based on ftrace, it will register addr by makes a hash, > > that is, if it is the same address, it should be filtered out. > > > > Looking at kernel code, it seems kernel will actually return error if > user specifies multiple duplicated names. Because kernel will > bsearch() to the first instance, and never resolve the second > duplicated instance. And then will assume that not all symbols are > resolved. right, as I wrote in here [1] it will fail [1] https://lore.kernel.org/bpf/ZHB0xNEbjmwHv18d@krava/ > > So, it worries me that we'll switch from kallsyms to available_filters > by default, because that introduces new failure modes. we did not care about duplicate with kallsyms because we used addresses, and I think with duplicate addresss the kprobe_multi link will probably attach (need to check) while with duplicate symbols it won't.. perhaps we could make sure we don't pass duplicate symbols? we do the kprobe_multi bench with symbol names read from available_filter_functions and we filter out duplicates jirka > > Either way, let's add a selftest that uses a duplicate function name > and see what happens? > > > The main problem here is not the problem of repeated registration of > > functions, but some functions are not allowed to hook. For example, when > > I track vfs_*, vfs_set_acl_prepare_kgid and vfs_set_acl_prepare_kuid are > > not allowed to hook. These exist under kallsyms, but > > available_filter_functions does not, I have observed for a while, > > matching through available_filter_functions can effectively prevent this > > from happening. > > Yeah, I understand that. My point above is that a) > available_filter_functions contains duplicates and b) doesn't contain > addresses. So we are forced to rely on kernel string -> addr > resolution, which doesn't seem to handle duplicate entries well (let's > test). > > So it's a regression to switch to that without taking any other precautions. > > > > > > > > >> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > > >> index ad1ec893b41b..3dd72d69cdf7 100644 > > >> --- a/tools/lib/bpf/libbpf.c > > >> +++ b/tools/lib/bpf/libbpf.c > > >> @@ -10417,13 +10417,14 @@ static bool glob_match(const char *str, const char *pat) > > >> struct kprobe_multi_resolve { > > >> const char *pattern; > > >> unsigned long *addrs; > > >> + const char **syms; > > >> size_t cap; > > >> size_t cnt; > > >> }; > > >> > > [...]