Stanislav Fomichev <sdf@xxxxxxxxxx> [Wed, 2021-01-20 18:09 -0800]:
> At the moment, BPF_CGROUP_INET{4,6}_BIND hooks can rewrite user_port
> to the privileged ones (< ip_unprivileged_port_start), but it will
> be rejected later on in the __inet_bind or __inet6_bind.
>
> Let's export 'port_changed' event from the BPF program and bypass
> ip_unprivileged_port_start range check when we've seen that
> the program explicitly overrode the port. This is accomplished
> by generating instructions to set ctx->port_changed along with
> updating ctx->user_port.
>
> Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> ---

...

> @@ -244,17 +245,27 @@ int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key,
>  	if (cgroup_bpf_enabled(type)) {					\
>  		lock_sock(sk);						\
>  		__ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type, \
> -							  t_ctx);	\
> +							  t_ctx, NULL);	\
>  		release_sock(sk);					\
>  	}								\
>  	__ret;								\
>  })
>
> -#define BPF_CGROUP_RUN_PROG_INET4_BIND_LOCK(sk, uaddr)			\
> -	BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET4_BIND, NULL)
> -
> -#define BPF_CGROUP_RUN_PROG_INET6_BIND_LOCK(sk, uaddr)			\
> -	BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET6_BIND, NULL)
> +#define BPF_CGROUP_RUN_PROG_INET_BIND_LOCK(sk, uaddr, type, flags)	\
> +({									\
> +	bool port_changed = false;					\

I see the discussion with Martin in [0] about a program overriding the
port but setting exactly the same value it already contains. Commenting
on this patch since the code is here.

From what I understand, there is no use-case for overriding the port
without changing its value just to bypass the capability check.

In this case the code can be simplified.

Here, instead of introducing port_changed, you can just remember the
original ((struct sockaddr_in *)uaddr)->sin_port or
((struct sockaddr_in6 *)uaddr)->sin6_port (they have the same
offset/size, so it can be simplified the same way as in
sock_addr_convert_ctx_access() for user_port) ...

> +	int __ret = 0;							\
> +	if (cgroup_bpf_enabled(type)) {					\
> +		lock_sock(sk);						\
> +		__ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type, \
> +							  NULL,		\
> +							  &port_changed); \
> +		release_sock(sk);					\
> +		if (port_changed)					\

... and then just compare the original and the new ports here.

The benefits will be:
* no need to introduce a port_changed field in struct bpf_sock_addr_kern;
* no need to change program instructions;
* no need to think about the compiler optimizing out those instructions;
* no need to think about coordination between multiple programs: the
  flag will be set only if the port has actually changed, which is easy
  to reason about from the user's perspective.

wdyt?

> +			*flags |= BIND_NO_CAP_NET_BIND_SERVICE;		\
> +	}								\
> +	__ret;								\
> +})
>
>  #define BPF_CGROUP_PRE_CONNECT_ENABLED(sk)				\
>  	((cgroup_bpf_enabled(BPF_CGROUP_INET4_CONNECT) ||		\
> @@ -453,8 +464,7 @@ static inline int bpf_percpu_cgroup_storage_update(struct bpf_map *map,
>  #define BPF_CGROUP_RUN_PROG_INET_EGRESS(sk,skb) ({ 0; })
>  #define BPF_CGROUP_RUN_PROG_INET_SOCK(sk) ({ 0; })
>  #define BPF_CGROUP_RUN_PROG_INET_SOCK_RELEASE(sk) ({ 0; })
> -#define BPF_CGROUP_RUN_PROG_INET4_BIND_LOCK(sk, uaddr) ({ 0; })
> -#define BPF_CGROUP_RUN_PROG_INET6_BIND_LOCK(sk, uaddr) ({ 0; })
> +#define BPF_CGROUP_RUN_PROG_INET_BIND_LOCK(sk, uaddr, type, flags) ({ 0; })
>  #define BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk) ({ 0; })
>  #define BPF_CGROUP_RUN_PROG_INET6_POST_BIND(sk) ({ 0; })
>  #define BPF_CGROUP_RUN_PROG_INET4_CONNECT(sk, uaddr) ({ 0; })

...

[0] https://lore.kernel.org/bpf/20210121223330.pyk4ljtjirm2zlay@kafai-mbp/

--
Andrey Ignatov
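
P.S. To make the "remember the original port and compare" idea concrete,
here is a rough userspace sketch. It is not kernel code and not a
replacement for the macro in the patch. Everything named here is a
hypothetical stand-in: bind_hook_fn and run_bind_hook() play the role of
__cgroup_bpf_run_filter_sock_addr() and the wrapping macro, the hook_*
functions play the role of attached BPF programs, and the flag value is a
placeholder chosen for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Placeholder value for this sketch; only the name comes from the patch. */
#define BIND_NO_CAP_NET_BIND_SERVICE (1u << 3)

/* The port sits at the same offset in both address structs, so one
 * accessor is enough for IPv4 and IPv6. */
_Static_assert(offsetof(struct sockaddr_in, sin_port) ==
	       offsetof(struct sockaddr_in6, sin6_port),
	       "sin_port and sin6_port must share an offset");

static uint16_t sockaddr_port(const struct sockaddr *uaddr)
{
	uint16_t port; /* network byte order */

	memcpy(&port, (const char *)uaddr +
	       offsetof(struct sockaddr_in, sin_port), sizeof(port));
	return port;
}

static void sockaddr_set_port(struct sockaddr *uaddr, uint16_t port)
{
	memcpy((char *)uaddr + offsetof(struct sockaddr_in, sin_port),
	       &port, sizeof(port));
}

/* Hypothetical stand-in for running the attached bind programs. */
typedef int (*bind_hook_fn)(struct sockaddr *uaddr);

/* Save the port, run the hook, and set the flag only if the value
 * actually differs afterwards. No extra context field is needed. */
static int run_bind_hook(struct sockaddr *uaddr, bind_hook_fn hook,
			 uint32_t *flags)
{
	uint16_t old_port;
	int ret;

	assert(uaddr && hook && flags);

	old_port = sockaddr_port(uaddr);
	ret = hook(uaddr);
	if (sockaddr_port(uaddr) != old_port)
		*flags |= BIND_NO_CAP_NET_BIND_SERVICE;
	return ret;
}

/* Example hook: rewrites the port to a privileged one. */
static int hook_rewrite_to_80(struct sockaddr *uaddr)
{
	sockaddr_set_port(uaddr, htons(80));
	return 0;
}

/* Example hook: writes back the value that was already there. */
static int hook_rewrite_same(struct sockaddr *uaddr)
{
	sockaddr_set_port(uaddr, sockaddr_port(uaddr));
	return 0;
}
```

With this shape, a hook that stores the same port back (the case from
[0]) leaves the flag clear, while a real rewrite sets it, regardless of
how many programs ran in between.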