We're observing some stalls on heavily loaded machines in the
cgroup_bpf_prog_query path, most likely because it blocks on
cgroup_mutex.

IIUC, the cgroup_mutex is there mostly to protect the non-effective
fields (cgrp->bpf.progs) which might be changed by the update path.
For the BPF_F_QUERY_EFFECTIVE case, all we need is to rcu_dereference
a bunch of pointers (and keep them around for consistency), so let's
do it; see the sketch at the bottom of this cover letter.

Since the RFC, I've added handling for the non-effective case as well.
It's a bit more complicated, but converting the prog hlist to RCU
seems to be all we need (unless I'm missing something). Plus a couple
of READ_ONCE/WRITE_ONCE for the flags so they can be read in a
lockless (racy) manner.

Stanislav Fomichev (4):
  bpf: export bpf_prog_array_copy_core
  rculist: add hlist_for_each_rcu
  bpf: refactor __cgroup_bpf_query
  bpf: query effective progs without cgroup_mutex

 include/linux/bpf-cgroup-defs.h |   2 +-
 include/linux/bpf-cgroup.h      |   1 +
 include/linux/bpf.h             |   2 +
 include/linux/rculist.h         |   6 ++
 kernel/bpf/cgroup.c             | 168 +++++++++++++++++++-------------
 kernel/bpf/core.c               |  14 ++-
 6 files changed, 114 insertions(+), 79 deletions(-)

-- 
2.40.1.521.gf1e218fcd8-goog
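
For illustration only, here is a minimal sketch of the effective-case
idea. The helper name effective_query_cnt is made up for this mail
(the real change lives in the refactored __cgroup_bpf_query), but the
RCU pattern is the point:

#include <linux/bpf.h>
#include <linux/bpf-cgroup.h>
#include <linux/rcupdate.h>

/* Hypothetical helper, not part of the series: read the effective
 * prog count without taking cgroup_mutex. cgrp->bpf.effective[] is
 * RCU-managed, so rcu_dereference() under rcu_read_lock() gives us
 * a pointer that stays valid for the whole read-side section.
 */
static u32 effective_query_cnt(struct cgroup *cgrp,
			       enum cgroup_bpf_attach_type atype)
{
	struct bpf_prog_array *effective;
	u32 cnt;

	rcu_read_lock();
	effective = rcu_dereference(cgrp->bpf.effective[atype]);
	cnt = bpf_prog_array_length(effective);
	rcu_read_unlock();

	return cnt;
}

Copying prog ids out to user space has to happen outside the
read-side section (copy_to_user() can fault), so the ids are first
gathered into a kernel buffer under rcu_read_lock(); IIUC that's
what exporting bpf_prog_array_copy_core in patch 1 is for.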