Eduard Zingerman <eddyz87@xxxxxxxxx> writes:

> Some time ago, in an off-list discussion, Alexei Starovoitov suggested
> compiling certain kfuncs to BPF to allow inlining calls to such kfuncs
> during verification. This RFC explores the idea.
>
> This RFC introduces a notion of inlinable BPF kfuncs.
> Inlinable kfuncs are compiled to BPF and are inlined by verifier after
> program verification. Inlined kfunc bodies are subject to dead code
> removal and removal of conditional jumps, if such jumps are proved to
> always follow a single branch.

Ohh, this is very exciting!

Mostly want to comment on this bit:

> Imo, this RFC is worth following through only if number of kfuncs
> benefiting from inlining is big. If the set is limited to dynptr
> family of functions, it is simpler to add a number of hard-coded
> inlining templates for such functions (similarly to what is currently
> done for some helpers).

One place where this would definitely be applicable is in all the XDP HW
metadata kfuncs. Right now, there's a function call for each piece of HW
metadata that an XDP program wants to read, which quickly adds up. And
in XDP land we are counting function calls, as the overhead (~1.1 ns) is
directly measurable in XDP PPS performance.

Back when we settled on the kfunc approach to reading metadata, we were
discussing this overhead, obviously, and whether we should do the
bespoke BPF assembly type inlining that we currently do for map lookups
and that sort of thing. We were told that the "right" way to do the
inlining is something along the lines of what you are proposing here, so
I would very much encourage you to continue working on this!

One complication for the XDP kfuncs is that the kfunc that the BPF
program calls is actually a stub function in the kernel core; at
verification time, the actual function call is replaced with one from
the network driver (see bpf_dev_bound_resolve_kfunc()). So somehow
supporting this (with kfuncs defined in drivers, i.e., in modules) would
be needed for the XDP use case.

Happy to help with benchmarking for the XDP use case when/if this can be
supported, of course! :)

(+Jesper, who I'm sure will be happy to help as well)

-Toke