On Wed, 2025-02-05 at 14:33 -0800, Andrii Nakryiko wrote:
> I see two ways forward for you. Either you can break apart your BPF
> object of ~100 BPF programs into more independent BPF objects (seeing
> that programs can be independently loaded/unloaded depending on
> configuration, seems like you do have a bunch of logic independence,
> right?). I assume shared BPF maps are the biggest reason to keep all
> those programs together in one BPF object. To share BPF maps between
> multiple BPF objects libbpf provides two complementary interfaces:
>
>   - bpf_map__reuse_fd() for manual control
>   - BPF map pinning (could be declarative or manual)
>
> This way you can ensure that all BPF objects would use the same BPF
> map, where necessary.

I think this approach *could* work, but it could easily become complex
for us because we'd need to track all the dependencies between programs
and maps, and anything missed could lead to difficult refcount bugs.
Further, splitting into objects incurs some performance and memory cost,
because bpf_object__load_vmlinux_btf() will be called for each object
and there's currently no way to share BTF data across the objects.
Having a single BPF object avoids this issue. Potentially, libbpf could
cache some BTF data to lessen the impact.

> Alternatively, we can look at this problem as needing libbpf to only
> prepare BPF program code (doing all the relocations and stuff like
> that), but then application actually taking care of loading/unloading
> BPF program with bpf_prog_load() outside of bpf_object abstraction.
> I've had almost-ready patches splitting bpf_object__load() into two
> steps: bpf_object__prepare() and bpf_object__load() after that.
> "prepare" step would create BPF maps, load BTF information, perform
> necessary relocations and arrive at final state of BPF program code
> (which you can get with bpf_program__insns() API), but stopping just
> short of actually doing bpf_prog_load() step.
>
> This seems like it would solve your problem as well. You'd use libbpf
> to do all the low-level ELF processing and relocation, but then take
> over managing BPF program lifetime. Loading/unloading as you see fit,
> including in parallel.
>
> Is this something that would work for you?

I think this API could work, though we would need a few other
modifications as well in order to correctly handle program/map
dependencies and account for relocations. At a high level, I think we'd
need something that includes:

1) A way to associate each BPF program with all the maps it will use
   (an association of struct bpf_program * --> list of struct bpf_map *
   in some form). This is so that we can load/unload the associated
   maps when we load/unload a program.

2) An API to create a BPF map, in case a new map needs to be loaded
   after initial startup.

3) An API to allow unloading a map while keeping map->fd reserved. This
   is important because the fd value is used by BPF program
   instructions, so without something like this, we'd have to redo the
   relocation process for any other BPF programs that access this map
   (and thus reload those programs too). This API could be implemented
   by dup'ing a placeholder fd, e.g. along the lines of the sketch
   below.
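
To make 3) concrete, here is a minimal user-space sketch of the
fd-reservation idea. The function names are made up for illustration,
/dev/null is an arbitrary placeholder, and this uses only plain POSIX
open()/dup2(), not any existing libbpf API:

#include <fcntl.h>
#include <unistd.h>

/* Hypothetical: "unload" a map but keep its fd number reserved, so
 * that relocated-but-not-yet-loaded program instructions embedding
 * this fd stay valid. dup2() atomically drops the map reference held
 * at map_fd while keeping the number allocated. */
static int reserve_map_fd(int map_fd)
{
	int placeholder = open("/dev/null", O_RDWR | O_CLOEXEC);

	if (placeholder < 0)
		return -1;
	if (dup2(placeholder, map_fd) < 0) {
		close(placeholder);
		return -1;
	}
	close(placeholder);
	return 0;
}

/* Hypothetical: install a freshly created map at the reserved fd
 * number, so programs relocated against that number can still be
 * loaded without redoing relocations. */
static int restore_map_fd(int new_map_fd, int reserved_fd)
{
	if (dup2(new_map_fd, reserved_fd) < 0)
		return -1;
	close(new_map_fd);
	return 0;
}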

Alternatively, if libbpf could automatically refcount maps across
multiple BPF objects and load/unload them on demand, then all of the
above work could happen behind the scenes. This would be similar to the
other approach you mentioned, but with libbpf doing the refcounting
heavy lifting instead of leaving it to each application, and thus more
robust and elegant. This would mean changing libbpf to (a) synchronize
access to some map functions and (b) allow struct bpf_map * to be
shared across BPF objects. Perhaps a concept of a "collection of BPF
objects" might allow for this.

> > This patch set also permits loading BPF programs in parallel if the
> > application wishes. We tested parallel loading with 200+ BPF
> > programs and found the load time dropped from 18 seconds to 5
> > seconds when done in parallel on a 6.8 kernel.
>
> bpf_object is intentionally single-threaded, so I don't think we'll
> be supporting parallel BPF program loading in the paradigm of
> bpf_object (but see the bpf_object__prepare() proposal). Even from
> API standpoint this is problematic with logging and log buffers
> basically assuming single-threaded execution of BPF program loading.
>
> All that could be changed or worked around, but your use case is not
> really a typical case, so I'm a bit hesitant at this point.

I can understand where you're coming from if no one else has mentioned
a use case like this. We can do parallel loading by splitting our
programs into BPF objects, but unless the objects are split very
evenly, this results in a less optimal load time. For example, if 100
programs are split into 2 objects and one object has 80 programs while
the other has 20, then the object with 80 programs creates a
bottleneck.
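
For reference, the split-object workaround would look roughly like the
sketch below, with each thread owning one bpf_object so that bpf_object
itself stays single-threaded. The .bpf.o paths are hypothetical and
error handling is abbreviated:

#include <pthread.h>
#include <bpf/libbpf.h>

static void *load_one(void *path)
{
	struct bpf_object *obj = bpf_object__open_file(path, NULL);

	if (!obj)
		return NULL;
	if (bpf_object__load(obj)) {
		bpf_object__close(obj);
		return NULL;
	}
	return obj;
}

int main(void)
{
	char *paths[] = { "progs_a.bpf.o", "progs_b.bpf.o" };
	pthread_t tids[2];
	void *objs[2];
	int i;

	for (i = 0; i < 2; i++)
		pthread_create(&tids[i], NULL, load_one, paths[i]);
	for (i = 0; i < 2; i++)
		pthread_join(tids[i], &objs[i]);
	/* ... attach programs; share maps across the objects via
	 * pinning or bpf_map__reuse_fd() where needed ... */
	return 0;
}

Total load time is then bounded by the largest object rather than the
sum of all of them, which is exactly why an uneven 80/20 split helps so
much less than an even one.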