Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor

Yonghong Song <yhs@xxxxxx> · Mon, 23 May 2022 08:41:38 -0700

On 5/20/22 5:03 PM, Stanislav Fomichev wrote:
On Thu, May 19, 2022 at 6:01 PM Yonghong Song <yhs@xxxxxx> wrote:



On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
Allow attaching to lsm hooks in the cgroup context.

Attaching to per-cgroup LSM works exactly like attaching
to other per-cgroup hooks. New BPF_LSM_CGROUP is added
to trigger new mode; the actual lsm hook we attach to is
signaled via existing attach_btf_id.

For the hooks that have 'struct socket' or 'struct sock' as its first
argument, we use the cgroup associated with that socket. For the rest,
we use 'current' cgroup (this is all on default hierarchy == v2 only).
Note that for some hooks that work on 'struct sock' we still
take the cgroup from 'current' because some of them work on the socket
that hasn't been properly initialized yet.

Behind the scenes, we allocate a shim program that is attached
to the trampoline and runs cgroup effective BPF programs array.
This shim has some rudimentary ref counting and can be shared
between several programs attaching to the same per-cgroup lsm hook.

Note that this patch bloats cgroup size because we add 211
cgroup_bpf_attach_type(s) for simplicity sake. This will be
addressed in the subsequent patch.

Also note that we only add non-sleepable flavor for now. To enable
sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
shim programs have to be freed via trace rcu, cgroup_bpf.effective
should be also trace-rcu-managed + maybe some other changes that
I'm not aware of.

Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
---
   arch/x86/net/bpf_jit_comp.c     |  24 +++--
   include/linux/bpf-cgroup-defs.h |   6 ++
   include/linux/bpf-cgroup.h      |   7 ++
   include/linux/bpf.h             |  25 +++++
   include/linux/bpf_lsm.h         |  14 +++
   include/linux/btf_ids.h         |   3 +-
   include/uapi/linux/bpf.h        |   1 +
   kernel/bpf/bpf_lsm.c            |  50 +++++++++
   kernel/bpf/btf.c                |  11 ++
   kernel/bpf/cgroup.c             | 181 ++++++++++++++++++++++++++++---
   kernel/bpf/core.c               |   2 +
   kernel/bpf/syscall.c            |  10 ++
   kernel/bpf/trampoline.c         | 184 ++++++++++++++++++++++++++++++++
   kernel/bpf/verifier.c           |  28 +++++
   tools/include/linux/btf_ids.h   |   4 +-
   tools/include/uapi/linux/bpf.h  |   1 +
   16 files changed, 527 insertions(+), 24 deletions(-)

A few nits below.

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index a2b6d197c226..5cdebf4312da 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1765,6 +1765,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
                          struct bpf_tramp_link *l, int stack_size,
                          int run_ctx_off, bool save_ret)
   {
+     void (*exit)(struct bpf_prog *prog, u64 start,
+                  struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
+     u64 (*enter)(struct bpf_prog *prog,
+                  struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
       u8 *prog = *pprog;
       u8 *jmp_insn;
       int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
[...]
       return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 134785ab487c..2c356a38f4cf 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -14,6 +14,9 @@
   #include <linux/string.h>
   #include <linux/bpf.h>
   #include <linux/bpf-cgroup.h>
+#include <linux/btf_ids.h>
+#include <linux/bpf_lsm.h>
+#include <linux/bpf_verifier.h>
   #include <net/sock.h>
   #include <net/bpf_sk_storage.h>

@@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
       return run_ctx.retval;
   }

+unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
+                                    const struct bpf_insn *insn)
+{
+     const struct bpf_prog *shim_prog;
+     struct sock *sk;
+     struct cgroup *cgrp;
+     int ret = 0;
+     u64 *regs;
+
+     regs = (u64 *)ctx;
+     sk = (void *)(unsigned long)regs[BPF_REG_0];

Maybe just my own opinion. Using BPF_REG_0 as index is a little bit
confusing. Maybe just use '0' to indicate the first parameters.
Maybe change 'regs' to 'params' is also a better choice?
In reality, trampline just passed an array of parameters to
the program. The same for a few places below.

Sure, let's rename it and use 0. I'll do args instead of params maybe?

'args' works for me too.


+     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));

I didn't experiment, but why container_of won't work?

There is a type check in container_of that doesn't seem to work for flex arrays:

kernel/bpf/cgroup.c:78:14: error: static_assert failed due to
requirement '__builtin_types_compatible_p(const struct bpf_insn,
struct bpf_insn []"
         shim_prog = container_of(insn, struct bpf_prog, insnsi);
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/container_of.h:19:2: note: expanded from macro 'container_of'
         static_assert(__same_type(*(ptr), ((type *)0)->member) ||       \
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
#define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
#define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                         ^              ~~~~
1 error generated.

You are right. Thanks for explanation.

[...]