On Fri, Aug 28, 2020 at 12:01 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> From: Alexei Starovoitov <ast@xxxxxxxxxx>
>
> Introduce sleepable BPF programs that can request such property for themselves
> via BPF_F_SLEEPABLE flag at program load time. In such case they will be able
> to use helpers like bpf_copy_from_user() that might sleep. At present only
> fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only
> when they are attached to kernel functions that are known to allow sleeping.
>
> The non-sleepable programs are relying on implicit rcu_read_lock() and
> migrate_disable() to protect life time of programs, maps that they use and
> per-cpu kernel structures used to pass info between bpf programs and the
> kernel. The sleepable programs cannot be enclosed into rcu_read_lock().
> migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs
> should not be enclosed in migrate_disable() as well. Therefore
> rcu_read_lock_trace is used to protect the life time of sleepable progs.
>
> There are many networking and tracing program types. In many cases the
> 'struct bpf_prog *' pointer itself is rcu protected within some other kernel
> data structure and the kernel code is using rcu_dereference() to load that
> program pointer and call BPF_PROG_RUN() on it. All these cases are not touched.
> Instead sleepable bpf programs are allowed with bpf trampoline only. The
> program pointers are hard-coded into generated assembly of bpf trampoline and
> synchronize_rcu_tasks_trace() is used to protect the life time of the program.
> The same trampoline can hold both sleepable and non-sleepable progs.
>
> When rcu_read_lock_trace is held it means that some sleepable bpf program is
> running from bpf trampoline. Those programs can use bpf arrays and preallocated
> hash/lru maps. These map types are waiting on programs to complete via
> synchronize_rcu_tasks_trace();
>
> Updates to trampoline now has to do synchronize_rcu_tasks_trace() and
> synchronize_rcu_tasks() to wait for sleepable progs to finish and for
> trampoline assembly to finish.
>
> This is the first step of introducing sleepable progs. Eventually dynamically
> allocated hash maps can be allowed and networking program types can become
> sleepable too.
>
> Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> Acked-by: Andrii Nakryiko <andriin@xxxxxx>

Acked-by: KP Singh <kpsingh@xxxxxxxxxx>

Thanks for kicking off the allow list. I will continue my analysis
looking at which hooks are sleepable and we can, eventually, generalize
the information into lsm_hook_defs.h
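
For anyone following along, here is a rough userspace sketch of the load-time
rule stated in the quoted commit message: a program may ask to be sleepable
only if it is fentry/fexit/fmod_ret or lsm, and only if its attach target is
known to allow sleeping. This is a plain C model, not the kernel code. Every
name in it (model_prog_kind, model_load_req, model_sleepable_ok) is
hypothetical and chosen for illustration, and the real check may be
structured quite differently.

```c
#include <stdbool.h>

/* Hypothetical model types: these names are illustrative only and do not
 * match the kernel's enums or structs. */
enum model_prog_kind {
	MODEL_FENTRY,
	MODEL_FEXIT,
	MODEL_FMOD_RET,
	MODEL_LSM,
	MODEL_OTHER,	/* any other networking or tracing program type */
};

struct model_load_req {
	enum model_prog_kind kind;
	bool wants_sleepable;	/* stands in for BPF_F_SLEEPABLE at load time */
	bool target_may_sleep;	/* attach target is known to allow sleeping */
};

/* Returns true if the load request is acceptable under the rule in the
 * commit message. Requests that do not ask to sleep always pass here. */
static bool model_sleepable_ok(const struct model_load_req *req)
{
	if (!req->wants_sleepable)
		return true;

	switch (req->kind) {
	case MODEL_FENTRY:
	case MODEL_FEXIT:
	case MODEL_FMOD_RET:
	case MODEL_LSM:
		/* Right program type; the target must also allow sleeping. */
		return req->target_may_sleep;
	default:
		/* Other program types cannot be sleepable at present. */
		return false;
	}
}
```

In this model the allow list is collapsed into the single target_may_sleep
flag; in practice that information would come from a per-hook list, which is
what generalizing it into lsm_hook_defs.h would provide.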