On Fri, May 14, 2021 at 6:06 AM xufeng zhang <yunbo.xufeng@xxxxxxxxxxxxxxxxx> wrote:
>
>
> On 2021/5/13 6:55 AM, Alexei Starovoitov wrote:
> > On Wed, May 12, 2021 at 05:58:23PM +0800, Xufeng Zhang wrote:
> >> To implement security rules for application containers by utilizing
> >> bpf LSM, the container to which the current running task belongs need
> >> to be known in bpf context. Think about this scenario: kubernetes
> >> schedules a pod into one host, before the application container can run,
> >> the security rules for this application need to be loaded into bpf
> >> maps firstly, so that LSM bpf programs can make decisions based on
> >> this rule maps.
> >>
> >> However, there is no effective bpf helper to achieve this goal,
> >> especially for cgroup v1. In the above case, the only available information
> >> from user side is container-id, and the cgroup path for this container
> >> is certain based on container-id, so in order to make a bridge between
> >> user side and bpf programs, bpf programs also need to know the current
> >> cgroup path of running task.
> > ...
> >> +#ifdef CONFIG_CGROUPS
> >> +BPF_CALL_2(bpf_get_current_cpuset_cgroup_path, char *, buf, u32, buf_len)
> >> +{
> >> +	struct cgroup_subsys_state *css;
> >> +	int retval;
> >> +
> >> +	css = task_get_css(current, cpuset_cgrp_id);
> >> +	retval = cgroup_path_ns(css->cgroup, buf, buf_len, &init_cgroup_ns);
> >> +	css_put(css);
> >> +	if (retval >= buf_len)
> >> +		retval = -ENAMETOOLONG;
> > Manipulating string path to check the hierarchy will be difficult to do
> > inside bpf prog. It seems to me this helper will be useful only for
> > simplest cgroup setups where there is no additional cgroup nesting
> > within containers.
> > Have you looked at *ancestor_cgroup_id and *cgroup_id helpers?
> > They're a bit more flexible when dealing with hierarchy and
> > can be used to achieve the same correlation between kernel and user cgroup ids.
>
> > KP,
> > do you have any suggestion?

I haven't really tried this yet, but have you considered using task local
storage to identify the container?

- Add a task local storage with container ID somewhere in the container
  manager
- Propagate this ID to all the tasks within a container using task
  security blob management hooks (like task_alloc and task_free) etc.

>
> what I am thinking is the internal kernel object(cgroup id or ns.inum)
> is not so user friendly, we can get the container-context from them for
> tracing scenario, but not for LSM blocking cases, I'm not sure how
> Google internally resolve similar issue.
>
>
> Thanks!
>
> Xufeng
>
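
To make the task local storage idea above a bit more concrete, here is a
rough userspace model in plain C. It is not BPF code and has not been run
against a kernel; it only illustrates the flow of tagging one task and
inheriting the tag at task allocation. All names (model_task, model_tag_task,
model_task_alloc, model_lsm_check) are hypothetical, and the numeric
container ID is an assumption. In a real BPF LSM program the per-task slot
would presumably be a BPF_MAP_TYPE_TASK_STORAGE map accessed with
bpf_task_storage_get().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 0 means "task does not belong to any container". */
#define MODEL_NO_CONTAINER 0u

/* Hypothetical stand-in for a task plus its task local storage slot. */
struct model_task {
	const struct model_task *parent;
	uint64_t container_id; /* value the container manager would store */
};

/* Container manager side: tag the first task of a container. */
static void model_tag_task(struct model_task *t, uint64_t id)
{
	t->container_id = id;
}

/* Models a task_alloc hook: the child inherits the parent's container ID. */
static void model_task_alloc(struct model_task *child,
			     const struct model_task *parent)
{
	child->parent = parent;
	child->container_id = parent ? parent->container_id
				     : MODEL_NO_CONTAINER;
}

/* Models an LSM decision hook: deny if the task is in the denied container. */
static int model_lsm_check(const struct model_task *t, uint64_t denied_id)
{
	if (t->container_id != MODEL_NO_CONTAINER &&
	    t->container_id == denied_id)
		return -EPERM;
	return 0;
}
```

With this model, a task forked from a tagged task carries the same ID, so a
rule keyed by container ID applies to it without any cgroup path lookup,
while tasks forked from untagged parents stay unaffected.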