On Wed, 16 Nov 2022 18:41:26 -0800 Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > Even with all optimization the performance overhead is not acceptable. > It feels to me that folks are still thinking about bpf trampoline > as a tracing facility. > It's a lot more than that. It needs to run 24/7 with zero overhead. It obviously doesn't have zero overhead. And correctness and maintainability trumps micro-optimizations. > It needs to replace the kernel functions and be invoked What do you mean by "replace the kernel functions"? You mean an existing kernel function can be replaced by a bpf program? Like live patching? This seems rather dangerous, and how does one know that their system has integrity? Is there a feature to sign bpf programs before they can be added? Also, it may be time to bring in the lawyers. If a bpf program can replace an existing kernel function, then it has definitely passed the "user space" exception to the GPL, where user space must use the system call interface. By injecting executable code into the kernel, especially something that replaces kernel functionality, it becomes arguably derived from the kernel itself. And the BPF program must be GPL. Allowing proprietary BPF programs to replace kernel functionality looks like a clear violation and circumvention of the GPL. But I could be mistaken. As I said, it's time to bring in the lawyers on this one. > millions times a second until the system is rebooted. > In this environment every nanosecond counts. > > Even if the fprobe side was completely free the patch 1 has so much > overhead in copy of bpf_cookie, regs, etc that it's a non-starter > for these use cases. > > There are several other fundamental issues in this approach > because of fprobe/ftrace. > It has ftrace_test_recursion_trylock and disables preemption. > Both are deal breakers. Please explain why? The recursion protection lock is a simply bit operation on the task struct which is used to protect against recursion at the same context. Which if you do not have, will likely happen, and the only hint of it is that the system triple faults and reboots. If you are only hooking to one function, then it is easy to figure this out. But with the multi work being done, that is no longer the case. Hooking to functions is *extremely* intrusive. And protection against errors is a must have, and not an option. > > bpf trampoline has to allow recursion in some cases. > See __bpf_prog_enter*() flavors. The recursion lock allows recursions, but not at the same context. That is, interrupt against normal context is fine. But really, you should not have it within the same context. How do you verify that you do not run out of stack? > > bpf trampoline also has to use migrate_disable instead of preemption > and rcu_read_lock() in some cases and rcu_read_lock_trace() in others. > > bpf trampoline must never allocate memory or grab locks. Neither should fprobes / ftrace. -- Steve > > All of these mandatory features exclude fprobe, ftrace, rethook > from possible options. > > Let's figure out how to address concerns with direct calls: