On Thu, Jan 9, 2020 at 12:39 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Wed, Jan 8, 2020 at 4:35 PM Martin KaFai Lau <kafai@xxxxxx> wrote: > > > > This series introduces BPF STRUCT_OPS. It is an infra to allow > > implementing some specific kernel's function pointers in BPF. > > The first use case included in this series is to implement > > TCP congestion control algorithm in BPF (i.e. implement > > struct tcp_congestion_ops in BPF). > > > > There has been attempt to move the TCP CC to the user space > > (e.g. CCP in TCP). The common arguments are faster turn around, > > get away from long-tail kernel versions in production...etc, > > which are legit points. > > > > BPF has been the continuous effort to join both kernel and > > userspace upsides together (e.g. XDP to gain the performance > > advantage without bypassing the kernel). The recent BPF > > advancements (in particular BTF-aware verifier, BPF trampoline, > > BPF CO-RE...) made implementing kernel struct ops (e.g. tcp cc) > > possible in BPF. > > > > The idea is to allow implementing tcp_congestion_ops in bpf. > > It allows a faster turnaround for testing algorithm in the > > production while leveraging the existing (and continue growing) BPF > > feature/framework instead of building one specifically for > > userspace TCP CC. > > > > Please see individual patch for details. > > > > The bpftool support will be posted in follow-up patches. > > > > v4: > > - Expose tcp_ca_find() to tcp.h in patch 7. > > It is used to check the same bpf-tcp-cc > > does not exist to guarantee the register() > > will succeed. > > - set_memory_ro() and then set_memory_x() only after all > > trampolines are written to the image in patch 6. (Daniel) > > spinlock is replaced by mutex because set_memory_* > > requires sleepable context. > > Applied. Thanks > > Please address any future follow up > and please remember to provide 'why' details in commit log > no matter how obvious the patch looks as Arnaldo pointed out. > > Re: 'bpftool module' command. > I think it's too early to call anything 'a module', > since in this context people will immediately assume 'a kernel module' > It's a loaded phrase with a lot of consequences. > bpf-tcp-cc cubic and dctcp do look like kernel modules, but they are > not kernel modules at all. imo making them look like kernel modules > has plenty of downsides. > So as an immediate followup for bpftool I'd recommend to stick > with 'bpftool struct_ops' command or whatever other name. > Just not 'bpftool module'. > > I think there is a makefile issue with selftests. > make clean;make -j50 > kept failing for me three times in a row with: > progs/bpf_dctcp.c:138:4: warning: implicit declaration of function > 'bpf_tcp_send_ack' is invalid in C99 [-Wimplicit-function-declaration] > bpf_tcp_send_ack(sk, *prior_rcv_nxt); > but single threaded 'make clean;make' succeeded. > Andrii, you have a chance to take a look and reproduce would be great. Spotted few problems that might have caused this, will send a fix shortly.