Hi folks,

I'd like to propose a modular BPF verifier as a discussion topic.

=== Motivation ===

A decade of production experience with BPF has shown that the desire
for feature availability outpaces the ability to deliver new kernels
into the field [0]. Therefore, the idea of modularizing the BPF
subsystem into a loadable kernel module (LKM) has started to look
appealing, as this would allow loading newer versions of the BPF
subsystem onto older versions of the kernel without a reboot.

That being said, the BPF subsystem is large and complex. It is not
practical to try and solve the entire problem all at once. So the
question is: where do we start?

Proposal: the verifier, because it is high value and architecturally
sympathetic to modularization.

**High value**: It is straightforward to reason about functionality
delivered through kfuncs, helpers, or maps. If feature A exists,
codepath X is taken; else, codepath Y. This is generally not practical
with verifier improvements - bugs or limitations there are far more
difficult to reason about, because an application cannot simply probe
for them and branch. Complexity grows sharply when applications support
many kernel versions. Maintaining a minimal set of cutting-edge
verifiers in the field is a value-add in the form of enablement,
reliability, and simplicity.

**Architecturally sympathetic**: The verifier is architecturally a
“pure function” [1]. Pure functions are easy to hot swap, as state
transfers are not necessary. Because of the verifier’s current design,
large re-architecting will not be necessary for modularization. This
means a modular verifier is primarily a refactoring project and can
lean on the existing test suite, making it a good first target.

If successful, a modular verifier gives us experience and develops a
toolkit of techniques that can be applied to the subsystem at large.

=== Goal ===

The goal is to refactor the verifier into an LKM with an eye towards
forward compatibility.

=== Design ===

[[ The following is a rough design based on early research. I expect it to ]]
[[ change as I gather feedback and do more prototyping work. Nothing is set ]]
[[ in stone.                                                                ]]

For forward compatibility, the idea is to implement a facade built into
each kernel that exposes a stable-enough (non-UAPI) interface such that
the verifier can remain portable and “plug in” to the running kernel.

While I expect the facade to be necessary, it will not be sufficient.
There will eventually be details the facade cannot hide, for example an
unavoidable ABI break. To solve for this, I/we [2] will maintain a
continuously exported copy of the verifier code in a separate
repository. From there we can develop branching, patching, or backport
strategies to mitigate breaks. The exact details are TBD and will
become clearer as work progresses.

On top of delivering newer verifiers to older kernels, the facade opens
the door to running the verifier in userspace. If the verifier becomes
sufficiently portable, we can implement a userspace facade and plug the
verifier in. A possible use case could be integrating the verifier into
Clang [3] for tightly integrated verifier feedback. This would address
a long-running pain point with BPF development. This is a lot easier
said than done, so consider this highly speculative.

The facade exists as a cooperative mechanism. While it might
technically be possible to do a non-cooperative modularization of the
verifier through aggressive patching and no kernel changes, it seems
unnecessarily complex given the alternative. Completion of the facade
does not block deployment - the facade seeks to reduce the chance of
stranding older kernels with newer verifier changes.

=== Footnotes ===

[0]: Disclaimer: this is not intended to be a criticism of anything -
     merely to point out the fact that the kernel as a singular
     delivery vehicle leaves a lot on the table.
[1]: Perhaps not in practice today, but deeper integration with the
     rest of the kernel can probably be cleaned up and abstracted.
[2]: It's likely more people will be involved if a modular verifier
     proves to be viable.
[3]: https://clang.llvm.org/docs/ClangPlugins.html
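
To make the facade idea from the Design section a bit more concrete, here is a minimal userspace sketch in C. Everything in it is hypothetical and invented for illustration: `struct verifier_facade`, `toy_verify()`, the `toy_insn` instruction format and the helper IDs are not kernel interfaces, and the real facade would be far wider. It only tries to show the shape of the arrangement: the verifier is a function of its inputs plus whatever the host answers through the facade, so the same verifier code could in principle sit behind a kernel facade or a userspace one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction; NOT struct bpf_insn. */
struct toy_insn {
	uint8_t opcode;
	int32_t imm;	/* for TOY_OP_CALL: the helper id */
};

enum { TOY_OP_MOV = 0, TOY_OP_CALL = 1, TOY_OP_EXIT = 2 };

enum toy_verdict {
	TOY_ACCEPT = 0,
	TOY_REJECT_ABI = -1,	/* facade version the verifier does not know */
	TOY_REJECT_EMPTY = -2,
	TOY_REJECT_HELPER = -3,	/* host does not provide the called helper */
	TOY_REJECT_OPCODE = -4,
	TOY_REJECT_NO_EXIT = -5,
};

#define TOY_FACADE_ABI 1u

/*
 * Hypothetical facade: the only channel through which the verifier
 * queries its host. A kernel and a userspace tool would each supply
 * their own implementation of the callbacks.
 */
struct verifier_facade {
	uint32_t abi_version;
	int (*helper_exists)(void *ctx, int32_t id);
	void *ctx;
};

/*
 * "Pure function" style verifier: the verdict depends only on the
 * program and on the facade's answers, and no state is kept between
 * calls, so swapping the implementation needs no state transfer.
 */
int toy_verify(const struct verifier_facade *f,
	       const struct toy_insn *insns, size_t len)
{
	size_t i;

	assert(f != NULL && f->helper_exists != NULL);

	if (f->abi_version != TOY_FACADE_ABI)
		return TOY_REJECT_ABI;
	if (insns == NULL || len == 0)
		return TOY_REJECT_EMPTY;

	for (i = 0; i < len; i++) {
		switch (insns[i].opcode) {
		case TOY_OP_MOV:
		case TOY_OP_EXIT:
			break;
		case TOY_OP_CALL:
			if (!f->helper_exists(f->ctx, insns[i].imm))
				return TOY_REJECT_HELPER;
			break;
		default:
			return TOY_REJECT_OPCODE;
		}
	}

	/* The program must end with an exit instruction. */
	if (insns[len - 1].opcode != TOY_OP_EXIT)
		return TOY_REJECT_NO_EXIT;
	return TOY_ACCEPT;
}

/* One possible host: a fixed table of available helper ids. */
struct toy_host {
	const int32_t *helpers;
	size_t nr_helpers;
};

static int toy_host_helper_exists(void *ctx, int32_t id)
{
	const struct toy_host *host = ctx;
	size_t i;

	for (i = 0; i < host->nr_helpers; i++)
		if (host->helpers[i] == id)
			return 1;
	return 0;
}

/* Build a facade backed by the table-based host above. */
struct verifier_facade toy_make_facade(struct toy_host *host)
{
	struct verifier_facade f = {
		.abi_version = TOY_FACADE_ABI,
		.helper_exists = toy_host_helper_exists,
		.ctx = host,
	};
	return f;
}
```

A caller would build a `struct toy_host`, wrap it with `toy_make_facade()`, and pass the result to `toy_verify()` along with the program. The version check stands in for the "details the facade cannot hide" case: when the host's facade is not one the verifier understands, the sketch refuses outright rather than guessing, which is only one of several possible policies.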