2023-01-03 15:46 UTC-0800 ~ Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> On Tue, Jan 3, 2023 at 7:03 AM Quentin Monnet <quentin@xxxxxxxxxxxxx> wrote:
>>
>> 2022-12-20 16:13 UTC-0800 ~ Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
>>> On Tue, Dec 20, 2022 at 3:34 AM Leo Yan <leo.yan@xxxxxxxxxx> wrote:
>>>>
>>>> On Tue, Dec 20, 2022 at 09:31:14AM +0800, Changbin Du wrote:
>>>>
>>>> [...]
>>>>
>>>>>>> Now will print below info:
>>>>>>> libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux
>>>>>>
>>>>>> Recently I encountered the same issue, it could be caused by:
>>>>>> either missing to install tool pahole or missing to enable kernel
>>>>>> configuration CONFIG_DEBUG_INFO_BTF.
>>>>>>
>>>>>> Could we give explicit info for reasoning failure? Like:
>>>>>>
>>>>>> "libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux,
>>>>>> please install pahole and enable CONFIG_DEBUG_INFO_BTF=y for kernel building".
>>>>>>
>>>>> This is vmlinux special information and similar tips are removed from
>>>>> patch V2. libbpf is common for all ELFs.
>>>>
>>>> Okay, I see. Sorry for noise.
>>>>
>>>>>>> Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory
>>>>>>
>>>>>> This log is confusing when we can find vmlinux file but without BTF
>>>>>> section. Consider to use a separate patch to detect vmlinux not
>>>>>> found case and print out "No such file or directory"?
>>>>>>
>>>>> I think it's already there. If the file doesn't exist, open will fail.
>>>>
>>>> [...]
>>>>
>>>>>>> @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>>>>>>>  	err = 0;
>>>>>>>
>>>>>>>  	if (!btf_data) {
>>>>>>> +		pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
>>>>>>>  		err = -ENOENT;
>>>>
>>>> btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
>>>> section, therefore, bpftool dumps error string "No such file or
>>>> directory".
>>>> This is confusing, because vmlinux actually exists.
>>>>
>>>> I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
>>>> better choice?) to replace -ENOENT here; this would keep bpftool from
>>>> printing "No such file or directory" in this case.
>>>
>>> The only really meaningful error code would be -ESRCH, which
>>> strerror() will translate to "No such process", which is also
>>> completely confusing.
>>>
>>> In general, I always found these strerror() messages extremely
>>> unhelpful and confusing. I wonder if we should make an effort to
>>> actually emit symbolic names of errors instead (literally, "-ENOENT"
>>> in this case). This is all tooling for engineers, I find -ENOENT or
>>> -ESRCH much more meaningful as an error message, compared to "No such
>>> file" seemingly human-readable interpretation.
>>>
>>> Quentin, what do you think about the above proposal for bpftool? We
>>> can have some libbpf helper internally and do it in libbpf error
>>> messages as well and just reuse the logic in bpftool, perhaps?
>>
>> Apologies for the delay.
>> What you're proposing is to replace all messages currently looking like
>> this:
>>
>>     $ bpftool prog
>>     Error: can't get next program: Operation not permitted
>>
>> by:
>>
>>     $ bpftool prog
>>     Error: can't get next program: -EPERM
>>
>> Do I understand correctly?
>
> yep, that's what I had in mind
>
>>
>> I think the strerror() messages are helpful on some occasions (they
>> _are_ more human-friendly to many users), but it's also true that
>> they're not always precise. With bpftool, "Invalid argument" is a
>> classic when the program doesn't load, and may lead to confusion with
>> the args passed to bpftool on the command line. Then there are the other
>> corner cases like the one discussed in this thread. So, why not.
>
> maybe the right approach would be to have both symbolic error name and
> its human-readable representation, so for example above
>
>     Error: can't get next program: [-EPERM] Operation not permitted
>
> or something like that? And if error value is unknown, just keep it as
> integer: "[-5555]" ?

That would be great, we'd have both the error name for savvy users and
the (more or less accurate) interpretation for others.

>> If we do change, yeah I'd rather have as much of this handling in libbpf
>> itself, and then adjust bpftool to handle the remaining cases, for
>> consistency.
>
> we can teach libbpf_strerror_r() to do this and if bpftool is going to
> use it consistently then it would get the benefit automatically

Sounds good to me.
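
For illustration only, the combined format discussed above ("[-EPERM]
Operation not permitted", falling back to "[-5555]" for unknown values)
could be sketched roughly as below. This is not libbpf code:
err_sym_name() and fmt_err() are hypothetical names made up for this
sketch, the name table covers only a handful of codes, and the
description text comes from plain strerror(), so it depends on the C
library and locale.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libbpf): map a few positive errno
 * values to their symbolic names. A real table would cover many more. */
static const char *err_sym_name(int err)
{
	switch (err) {
	case EPERM:  return "EPERM";
	case ENOENT: return "ENOENT";
	case ESRCH:  return "ESRCH";
	case EINVAL: return "EINVAL";
	default:     return NULL;
	}
}

/* Hypothetical formatter: accepts the error with either sign and writes
 * "[-ENAME] description" for known codes, or "[-N]" for unknown ones.
 * Assumption: err is never INT_MIN, so negating it is safe. */
static char *fmt_err(int err, char *buf, size_t len)
{
	const char *name;

	if (err < 0)
		err = -err;

	name = err_sym_name(err);
	if (name)
		snprintf(buf, len, "[-%s] %s", name, strerror(err));
	else
		snprintf(buf, len, "[-%d]", err);

	return buf;
}
```

With a caller-provided buffer, fmt_err(-EPERM, buf, sizeof(buf)) yields a
string starting with "[-EPERM] " followed by the C library's description,
and fmt_err(-5555, buf, sizeof(buf)) yields "[-5555]". Keeping the
symbolic name first means it is always present even when the description
is misleading, as in the -ENOENT case from this thread.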