Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:

[...]

> Back to your question of how fw2 will get loaded.. I'm thinking the following:
> 1. Static linking:
>   obj = bpf_object__open("rootlet.o", "fw1.o", "fw2.o");
>   // libbpf adjusts call offsets and links into single loadable bpf_object
>   bpf_object__load(obj);
>   bpf_set_link_xdp_fd()
> No kernel changes are necessary to support program chaining via static linking.
>
> 2. Dynamic linking:
>   // assuming libxdp.so manages eth0
>   rootlet_fd = get_xdp_fd(eth0);
>   subprog_btf_id = libbpf_find_prog_btf_id("name_of_placeholder", rootlet_fd);
>   // ^ this function is in patch 16/18 of trampoline
>   attr.attach_prog_fd = rootlet_fd;
>   attr.attach_btf_id = subprog_btf_id;
>   // pair (prog_fd, btf_id) needs to be specified at load time
>   obj = bpf_object__open("fw2.o", attr);
>   bpf_object__load(obj);
>   prog = bpf_object__find_program_by_title(obj);
>   link = bpf_program__replace(prog); // similar to bpf_program__attach_trace()
>   // no extra arguments during 'replace'.
>   // Target (prog_fd, btf_id) already known to the kernel and verified

OK, this makes sense.

>> So the two component programs would still exist as kernel objects,
>> right?
>
> yes. Both fw1.o and fw2.o will be loaded and running instead of placeholders.
>
>> And the trampolines would keep individual stats for each one (if
>> BPF stats are enabled)?
>
> In case of dynamic linking both fw1.o and fw2.o will be seen as individual
> programs from 'bpftool p s' point of view. And both will have
> individual stats.

Right, this is important, and I think it's where my skepticism about
static linking comes from. With static linking, each XDP program will be
"reduced" to a subprog instead of a full stand-alone program. Which
means that its execution will be different depending on whether it is
just attached directly to an interface, or if it's been linked with a
rootlet before loading.
I'll admit I don't know enough about how subprograms actually work to
know if it's a *meaningful* difference, so I guess I'll go play around
with it. If nothing else, experimenting with static linking can be a way
to hash out the semantics until dynamic linking lands.

>> Could userspace also extract the prog IDs being
>> referenced by the "glue" proglet?
>
> Not sure I follow. Both fw1.o and fw2.o will have their own prog ids.
> fw1_prog->aux->linked_prog == rootlet_prog
> fw2_prog->aux->linked_prog == rootlet_prog
> Unloading and detaching fw1.o will make kernel to switch back to placeholder
> subprog in rootlet_prog. I believe rootlet_prog should not keep a list of progs
> that attached to it (or replaced its subprogs) to avoid circular
> dependency.

Well, I did mean the link in the other direction. But thinking about it
some more, I don't think it really matters. The important bit is that
userspace can answer the question "given that rootlet ID X is currently
attached on eth0, which two program IDs Y and Z will actually run on
that interface?". And if there's a link in the other direction, it could
just iterate over all loaded programs in the kernel to find them, so
that is OK; as long as we can also tell in which "slot" in the rootlet a
given program is currently attached.

> Due to that detaching rootlet_prog from netdev will stop the flow of
> packets into fw1.o, but refcnt of rootlet_prog will not go to zero, so
> it will stay in memory until both fw1.o and fw2.o detach from
> rootlet.o.

OK, that is probably fine. I think we should teach most utilities to
deal with this anyway; in particular, iproute2 should know about
multi-progs (i.e., link against libxdp).

>> What about attaching a third program? Would that work by recursion (as
>> above, but with the old proglet as old_fd), or should the library build
>> a whole new sequence from the component programs?
>
> This choice is up to libxdp.so.
> It can have a number of placeholders
> ready to be replaced by new progs. Or it can re-generate rootlet.o
> every time new fwX.o comes along. Short term I would start development
> with auto-generated rootlet.o and static linking done by libbpf
> while the policy and rootlet are done by libxdp.so, since this work
> doesn't depend on any kernel changes. Long term auto-generation
> can stay in libxdp.so if it turns out to be sufficient.

Yes, as I said above, this at least sounds like a start.

>> Finally, what happens if someone were to try to attach a retprobe to
>> one of the component programs? Could it be possible to do that even
>> while program is being run from proglet dispatch? That way we can still
>> debug an individual XDP program even though it's run as part of a chain.
>
> Right. The fentry/fexit tracing is orthogonal to static/dynamic linking.
> It will be available for all prog types after trampoline patches land.
> See fexit_bpf2bpf.c example in the last 18/18 patch.
> We will be able to debug XDP program regardless whether it's a rootlet
> or a subprogram. Doesn't matter whether linking was static or dynamic.

OK, that's great, and certainly resolved one point of skepticism :)

> With fentry/fexit we will be able to do different stats too.
> Right now bpf program stats are limited to cycles and I resisted a lot
> of pressure to add more hard coded stats. With fentry/fexit we can
> collect arbitrary counters per program. Like number of L1-cache misses
> or number of TLB misses in a given XDP prog.

Yeah, that makes a lot of sense, of course. Great!

>> Sounds reasonable. Any reason libxdp.so couldn't be part of libbpf?
>
> libxdp.so is a policy specifier while libbpf is a tool. It makes more
> sense for them to be separate. libbpf has strong api compatibility
> guarantees. While I don't think anyone knows at this point how libxdp
> api should look and it will take some time for it to mature.

Well, we'd want libxdp to have the same strong API guarantees,
eventually. Which would be a reason to just include it in libbpf. But
sure, I wasn't suggesting to do this from the get-go; we can start out
with something separate and decide later when/if it makes sense to
integrate. As long as libbpf can do the heavy lifting on the actual
linking that is fine with me.

-Toke
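
P.S. To make the "which programs run in which slot of rootlet X"
question a bit more concrete, here is a rough userspace sketch of the
lookup I have in mind. Everything in it is made up for illustration:
struct prog_info_sketch, its linked_prog_id and slot fields, and
find_attached() are hypothetical and do not correspond to anything the
kernel or libbpf exposes today. A real version would presumably walk the
loaded programs with bpf_prog_get_next_id() and read whatever link
information the kernel ends up exporting; here the program list is just
an in-memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL userspace view of one loaded program. The linked_prog_id
 * and slot fields are assumptions; no such fields are exported today. */
struct prog_info_sketch {
	uint32_t id;             /* program ID of this program */
	uint32_t linked_prog_id; /* ID of the rootlet it is linked into, 0 if none */
	uint32_t slot;           /* index of the placeholder it replaced */
};

/* Scan all known programs and record, per slot, the ID of the program
 * that is linked into rootlet_id. The caller zero-initialises out[], so
 * an entry that stays 0 means "placeholder still in place". Slots at or
 * beyond max_slots are ignored. Assumes at most one program per slot.
 * Returns the number of slots that were filled in. */
size_t find_attached(const struct prog_info_sketch *progs, size_t n,
		     uint32_t rootlet_id, uint32_t *out, size_t max_slots)
{
	size_t found = 0;

	assert(progs != NULL && out != NULL && rootlet_id != 0);

	for (size_t i = 0; i < n; i++) {
		if (progs[i].linked_prog_id != rootlet_id)
			continue; /* not attached to this rootlet */
		if (progs[i].slot >= max_slots)
			continue; /* slot index out of range for out[] */
		out[progs[i].slot] = progs[i].id;
		found++;
	}
	return found;
}
```

With something like this, a tool only needs the rootlet ID from the
netdev plus one pass over the loaded programs; the rootlet itself never
has to keep a list of what is attached to it.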