On Tue, Feb 21, 2023 at 1:37 PM Daniel Müller <deso@xxxxxxxxxx> wrote: > > On Fri, Feb 17, 2023 at 04:32:05PM -0800, Andrii Nakryiko wrote: > > On Fri, Feb 17, 2023 at 11:19 AM Daniel Müller <deso@xxxxxxxxxx> wrote: > > > > > > This change adds support for attaching uprobes to shared objects located > > > in APKs, which is relevant for Android systems where various libraries > > > > Is there a good link with description of APK that we can record > > somewhere in the comments for future us? > > Perhaps > https://en.wikipedia.org/w/index.php?title=Apk_(file_format)&oldid=1139099120#Package_contents. > > Will add it. > > > Also, does .apk contains only shared libraries, or it could be also > > just a binary? > > It probably could also be for a binary, judging from applications being > available for download in the form of APKs. > > > > may reside in APKs. To make that happen, we extend the syntax for the > > > "binary path" argument to attach to with that supported by various > > > Android tools: > > > <archive>!/<binary-in-archive> > > > > > > For example: > > > /system/app/test-app/test-app.apk!/lib/arm64-v8a/libc++_shared.so > > > > > > APKs need to be specified via full path, i.e., we do not attempt to > > > resolve mere file names by searching system directories. > > > > mere? > > Yes? I'm just confused what "resolve mere file names" means in this context. Like, which file names are not "mere"? > > > > > > > We cannot currently test this functionality end-to-end in an automated > > > fashion, because it relies on an Android system being present, but there > > > is no support for that in CI. I have tested the functionality manually, > > > by creating a libbpf program containing a uretprobe, attaching it to a > > > function inside a shared object inside an APK, and verifying the sanity > > > of the returned values. > > > > > > Signed-off-by: Daniel Müller <deso@xxxxxxxxxx> > > > --- > > > tools/lib/bpf/libbpf.c | 84 ++++++++++++++++++++++++++++++++++++++++-- > > > 1 file changed, 80 insertions(+), 4 deletions(-) > > > [...] > > > + return -LIBBPF_ERRNO__FORMAT; > > > + } > > > + > > > + if (zip_archive_find_entry(archive, file_name, &entry)) { > > > + pr_warn("zip: could not find archive member %s in %s\n", file_name, archive_path); > > > + ret = -LIBBPF_ERRNO__FORMAT; > > > + goto out; > > > + } > > > + > > > + if (entry.compression) { > > > + pr_warn("zip: entry %s of %s is compressed and cannot be handled\n", file_name, > > > + archive_path); > > > + ret = -LIBBPF_ERRNO__FORMAT; > > > + goto out; > > > + } > > > + > > > + elf = elf_memory((void *)entry.data, entry.data_length); > > > + if (!elf) { > > > + pr_warn("elf: could not read elf file %s from %s: %s\n", file_name, archive_path, > > > > I kind of like preserving the "archive/path!/file/path" consistently > > through error messages when referring to file within APK, WDYT? > > It seems valuable to me to make it clear that we "parsed" the string correctly > and split it into the expected parts. it's debatable, if the user doesn't trust libbpf to handle "archive/path!/file/path" spec correctly, then it's too bad. My point here is to keep it consistent with the way that user is specifying it in SEC("") definition > > > > + elf_errmsg(-1)); > > > + ret = -LIBBPF_ERRNO__FORMAT; > > > + goto out; > > > + } > > > + > > > + ret = elf_find_func_offset(elf, file_name, func_name); > > > + if (ret > 0) { > > > + ret += entry.data_offset; > > > + pr_debug("elf: symbol address match for '%s' in '%s': 0x%lx\n", func_name, > > > + archive_path, ret); > > > > so for debugging I feel like we'll want to know both entry.data_offset > > and original ELF offset, let's report all three offset (including the > > final calculated one)? > > I added one more pr_debug() printing the entry offset. The ELF offset is > reported by elf_find_func_offset() and the final offset here. sure, but here we can have all of that conveniently in a single (debug) log message, so why not? > > > > + } > > > + elf_end(elf); > > > + > > > +out: > > > + zip_archive_close(archive); > > > + return ret; > > > +} > > > + > > > static const char *arch_specific_lib_paths(void) > > > { > > > /* > > > @@ -10789,6 +10844,9 @@ bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid, > > > { > > > DECLARE_LIBBPF_OPTS(bpf_perf_event_opts, pe_opts); > > > char errmsg[STRERR_BUFSIZE], *legacy_probe = NULL; > > > + const char *archive_path = NULL; > > > + const char *archive_sep = NULL; > > > > nit: combine on a single line? > > > > > + char full_archive_path[PATH_MAX]; > > > char full_binary_path[PATH_MAX]; > > > struct bpf_link *link; > > > size_t ref_ctr_off; > > > @@ -10806,9 +10864,21 @@ bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid, > > > if (!binary_path) > > > return libbpf_err_ptr(-EINVAL); > > > > > > - if (!strchr(binary_path, '/')) { > > > - err = resolve_full_path(binary_path, full_binary_path, > > > - sizeof(full_binary_path)); > > > + /* Check if "binary_path" refers to an archive. */ > > > + archive_sep = strstr(binary_path, "!/"); > > > + if (archive_sep) { > > > + if (archive_sep - binary_path >= sizeof(full_archive_path)) { > > > > very unlikely to happen, I wouldn't bother checking, especially that > > strncpy will just truncate and make us fail anyways > > How will it "make us fail"? It will silently truncate the path, no? right, it will be invalid path. But we don't expect this, because we allocated PATH_MAX, so it's only if user goes crazy and makes up some huge invalid path, which never was going to succeed anyways. So I'd drop this check altogether. > > > > + return libbpf_err_ptr(-EINVAL); > > > + } > > > + > > > + strncpy(full_archive_path, binary_path, archive_sep - binary_path); > > > > let's use saner libbpf_strlcpy() instead of strncpy, we stopped using > > strncpy relatively recently > > Okay. > [...]