On Tue, Sep 28, 2021 at 09:30:23AM +0100, Lorenz Bauer wrote:
> On Mon, 27 Sept 2021 at 17:50, Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Mon, Sep 27, 2021 at 05:12:15PM +0100, Lorenz Bauer wrote:
> > > On Sat, 25 Sept 2021 at 00:13, Alexei Starovoitov
> > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > >
> > > > On Thu, Sep 23, 2021 at 12:33:58PM +0100, Lorenz Bauer wrote:
> > > > >
> > > > > Some questions:
> > > > > * How can this handle kernels that don't have built-in BTF? Not a
> > > > > problem for myself, but some people have to deal with BTF-less distro
> > > > > kernels by using pahole to generate external BTF from debug symbols.
> > > > > Can we accommodate that?
> > > >
> > > > I think so, but it probably should be done as a generic feature:
> > > > "populate kernel BTF".
> > > > When kernel wasn't compiled with BTF there could be a way to
> > > > populate it with such. Just like we do sys_bpf(BTF_LOAD)
> > > > for program's BTF we can allow populating vmlinux BTF this way.
> > > > Unlike builtin BTF it wouldn't be trusted for certain verifier assumptions,
> > > > but better than nothing and more convenient than specifying BTF file
> > > > on a side for every bpf prog load with traditional libbpf style.
> > >
> > > From my POV we already have an API for external BTF (and I think
> > > libbpf does too?) but would need a new API for "load kernel BTF".
> > > Global state like this also doesn't work well for several individual
> > > processes. Imagine multiple programs on the system trying to each
> > > replace the kernel BTF, how would that work? Which one wins?
> >
> > The kernel BTF can be only one, of course.
> > I don't expect progs to update the kernel BTF when they start.
> > It's more of the admin/chef job when kernel boots.
> > Only for the cases when kernel somehow was compiled without BTF.
> >
> > > Being
> > > able to give my own fd for kernel BTF circumvents all those problems
> > > and seems much cleaner to me.
> >
> > You mean to pass kernel BTF's fd to the kernel?
> > It's doable, but I don't quite see the operational side of it.
> > If progs have to carry both their BTF and kernel BTF why would
> > they need CO-RE at all? If they were compiled with given kernel BTF
> > there is no need to adjust offsets for the given host.
> > I suspect I simply don't understand your use case :)
>
> This is the "distro ships without BTF" case that the aqua sec folks
> have been grappling with, and for which btfhub is a solution. If the
> distro disables BTF they are unlikely to perform this "admin" job in
> the first place. So whose responsibility is it to load that BTF?
> Currently it falls on the developers of the user space tooling to
> provide alternative BTF. Only allowing a single replacement BTF makes
> this difficult.

There is only one BTF that matches the kernel.
If one was buggy (due to pahole/compiler issue) it would be replaced
with the fixed one.
I can see the case where two vmlinux BTFs would be used for testing.
Like the kernel compiled with clang produces one BTF and the kernel
compiled with gcc->pahole produces another BTF, but the vmlinux would
be different too.
So the admins/users should be using BTF that matches the kernel.

> Here is why:
> * Since external BTF is a thing, loaders today have to provide a way
> to relocate against external BTF in a non-standard location. This
> means loading the file from disk and then performing CO-RE using that
> info.
> * Users of the loader build a btfhub integration (or similar) and
> provide a path to the external BTF during load. They do this because
> they will have to support legacy kernels for years to come.
> * Under my proposal, a loader can detect whether in-kernel CO-RE is
> supported, load the BTF provided by the user into the kernel, and pass
> that fd to PROG_LOAD.
> * This is transparent to the user: they keep using their existing BTF
> but get the benefit of canonical CO-RE resolution.
>
> We don't have to introduce a new loader-side API to deal with this
> situation. We also don't have to deal with a global resource that is
> subject to the whims of the distro.

I agree with all of the above.
It's easy to add 'target_vmlinux_btf_fd' to PROG_LOAD and let CO-RE
in the kernel use that, but the kernel has dynamically loaded kernel
modules and it does search through them. They will not be supported
in such case.
I think it's an ok limitation.
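
To make the loader-side flow above concrete, here is a rough sketch of
the decision a loader might make. It is only an illustration under
stated assumptions, not existing libbpf or kernel code: the enum, both
function names, and the "prefer built-in BTF when present" ordering are
hypothetical. Only the /sys/kernel/btf/vmlinux path and the
'target_vmlinux_btf_fd' idea come from existing kernels and this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical: where the loader takes the target (kernel) BTF from. */
enum btf_source {
    BTF_SRC_BUILTIN,            /* kernel exposes its own vmlinux BTF */
    BTF_SRC_EXTERNAL_FD,        /* external BTF loaded via BTF_LOAD; its fd would be
                                 * passed as the proposed 'target_vmlinux_btf_fd'.
                                 * Per the thread, module BTF is not searched here. */
    BTF_SRC_EXTERNAL_USERSPACE, /* external BTF; loader does CO-RE itself (today's way) */
    BTF_SRC_NONE,               /* no target BTF available at all */
};

/* Built-in BTF is exported at this sysfs path on kernels compiled with it. */
bool kernel_has_builtin_btf(void)
{
    return access("/sys/kernel/btf/vmlinux", R_OK) == 0;
}

/*
 * Hypothetical helper: pick a BTF source.
 * Assumption: built-in BTF wins when present, since it is the one BTF
 * guaranteed to match the running kernel.
 */
enum btf_source choose_btf_source(bool has_builtin_btf,
                                  bool in_kernel_core_supported,
                                  const char *external_btf_path)
{
    if (has_builtin_btf)
        return BTF_SRC_BUILTIN;
    if (external_btf_path == NULL)
        return BTF_SRC_NONE;
    /* User supplied BTF (e.g. from btfhub): hand it to the kernel if it can
     * do CO-RE itself, otherwise keep relocating in user space. */
    return in_kernel_core_supported ? BTF_SRC_EXTERNAL_FD
                                    : BTF_SRC_EXTERNAL_USERSPACE;
}
```

A loader would call choose_btf_source(kernel_has_builtin_btf(), ...)
once per load; how it probes for in-kernel CO-RE support is left open
here, since that interface does not exist yet.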