> On Wed, Mar 01, 2023 at 11:02:25AM -0800, Andrii Nakryiko wrote: > > > Use libbpf_strerror_r to expand the error when failed to parse the btf > > file at btf_custom_path. It does not change a lot locally, but since the > > error will bubble up through a few layers, it may become quite > > confusing otherwise. As an example here is what happens when the file > > indicated via btf_custom_path does not exist and the caller uses > > strerror as well: > > > > libbpf: failed to parse target BTF: -2 > > libbpf: failed to perform CO-RE relocations: -2 > > libbpf: failed to load object 'bpf_probe' > > libbpf: failed to load BPF skeleton 'bpf_probe': -2 > > [caller]: failed to load BPF object (errno: 2 | message: No such file or directory) > > > > In this context "No such file or directory" could be easily > > misinterpreted as belonging to some other part of loading process, e.g. > > the BPF object itself. With this change it would look a bit better: > > > > libbpf: failed to parse target BTF: No such file or directory > > libbpf: failed to perform CO-RE relocations: -2 > > libbpf: failed to load object 'bpf_probe' > > libbpf: failed to load BPF skeleton 'bpf_probe': -2 > > [caller]: failed to load BPF object (errno: 2 | message: No such file or directory) > > I find these text-only error messages more harmful, actually. Very > often their literal meaning is confusing, and instead the process is > to guess what's -Exxx error they represent, and go from there. > > Recently me and Quentin discussed moving towards an approach where > we'd log both symbolic error value (-EPERM instead of -1) and also > human-readable text message. So I'd prefer us figuring out how to do > this ergonomically in libbpf and bpftool code base, and start moving > in that direction. Fair enough, thanks. I would love to try out any suggestions in this area -- we were recently looking into error handling, and certain parts were suboptimal. Talking about confusing text error messages, I'm curious about -ESRCH usage. It's being used in libbpf and various subsystem as well to indicate that something wasn't found, so I guess it's an established practice. But then in case btf__load_vmlinux_btf can't find a proper file and reports an error, the caller gets surprising "No such process" out of strerror. Am I missing something, is it implemented like this on purpose?