Stanislav Fomichev <sdf@xxxxxxxxxx> [Fri, 2021-01-22 11:54 -0800]:
> On Fri, Jan 22, 2021 at 11:37 AM Andrey Ignatov <rdna@xxxxxx> wrote:
> >
> > Stanislav Fomichev <sdf@xxxxxxxxxx> [Wed, 2021-01-20 18:09 -0800]:
> > > At the moment, BPF_CGROUP_INET{4,6}_BIND hooks can rewrite user_port
> > > to the privileged ones (< ip_unprivileged_port_start), but it will
> > > be rejected later on in the __inet_bind or __inet6_bind.
> > >
> > > Let's export 'port_changed' event from the BPF program and bypass
> > > ip_unprivileged_port_start range check when we've seen that
> > > the program explicitly overrode the port. This is accomplished
> > > by generating instructions to set ctx->port_changed along with
> > > updating ctx->user_port.
> > >
> > > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> > > ---
> > ...
> > > @@ -244,17 +245,27 @@ int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key,
> > >      if (cgroup_bpf_enabled(type)) { \
> > >          lock_sock(sk); \
> > >          __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type, \
> > > -                                                  t_ctx); \
> > > +                                                  t_ctx, NULL); \
> > >          release_sock(sk); \
> > >      } \
> > >      __ret; \
> > >  })
> > >
> > > -#define BPF_CGROUP_RUN_PROG_INET4_BIND_LOCK(sk, uaddr) \
> > > -    BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET4_BIND, NULL)
> > > -
> > > -#define BPF_CGROUP_RUN_PROG_INET6_BIND_LOCK(sk, uaddr) \
> > > -    BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET6_BIND, NULL)
> > > +#define BPF_CGROUP_RUN_PROG_INET_BIND_LOCK(sk, uaddr, type, flags) \
> > > +({ \
> > > +    bool port_changed = false; \
> >
> > I see the discussion with Martin in [0] on the program overriding the
> > port but setting exactly same value as it already contains. Commenting
> > on this patch since the code is here.
> >
> > From what I understand there is no use-case to support overriding the
> > port w/o changing the value to just bypass the capability. In this case
> > the code can be simplified.
> >
> > Here instead of introducing port_changed you can just remember the
> > original ((struct sockaddr_in *)uaddr)->sin_port or
> > ((struct sockaddr_in6 *)uaddr)->sin6_port (they have same offset/size so
> > it can be simplified same way as in sock_addr_convert_ctx_access() for
> > user_port) ...
> >
> > > +    int __ret = 0; \
> > > +    if (cgroup_bpf_enabled(type)) { \
> > > +        lock_sock(sk); \
> > > +        __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type, \
> > > +                                                  NULL, \
> > > +                                                  &port_changed); \
> > > +        release_sock(sk); \
> > > +        if (port_changed) \
> >
> > ... and then just compare the original and the new ports here.
> >
> > The benefits will be:
> > * no need to introduce port_changed field in struct bpf_sock_addr_kern;
> > * no need to do change program instructions;
> > * no need to think about compiler optimizing out those instructions;
> > * no need to think about multiple programs coordination, the flag will
> >   be set only if port has actually changed what is easy to reason about
> >   from user perspective.
> >
> > wdyt?
> Martin mentioned in another email that we might want to do that when
> we rewrite only the address portion of it.
> I think it makes sense. Imagine doing 1.1.1.1:50 -> 2.2.2.2:50 it
> seems like it should also work, right?
> And in this case, we need to store and compare addresses as well and
> it becomes messy :-/

Why does the address matter? CAP_NET_BIND_SERVICE is only about ports,
not addresses.

IMO an address change should not matter for bypassing
CAP_NET_BIND_SERVICE in this case, and correspondingly there should be
no need to compare addresses; comparing the port should be enough.

> It also seems like it would be nice to have this 'bypass
> cap_net_bind_service' without changing the address while we are at it.

Yeah, this part determines the behaviour. I guess it should be use-case
driven. So far it seems to be more of a "nice to have" than a real use
case, but I could be missing something; please correct me if so.
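
To make the remember-and-compare suggestion concrete, here is a rough
userspace sketch. It is not kernel code: run_bind_hook(), sockaddr_port(),
bind_hook_fn and the two example hooks are hypothetical names invented for
illustration, and the function pointer merely stands in for the attached
BPF program(s). It only shows that the port can be read the same way for
both address families and compared after the program has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* sin_port and sin6_port share offset and size, so one read covers both. */
_Static_assert(offsetof(struct sockaddr_in, sin_port) ==
               offsetof(struct sockaddr_in6, sin6_port),
               "port offsets differ");
_Static_assert(sizeof(((struct sockaddr_in *)0)->sin_port) ==
               sizeof(((struct sockaddr_in6 *)0)->sin6_port),
               "port sizes differ");

/* Hypothetical stand-in for the attached cgroup BPF program(s). */
typedef int (*bind_hook_fn)(struct sockaddr *uaddr);

/* Read the port (network byte order) from an IPv4 or IPv6 sockaddr. */
static in_port_t sockaddr_port(const struct sockaddr *uaddr)
{
    in_port_t port;

    memcpy(&port,
           (const char *)uaddr + offsetof(struct sockaddr_in, sin_port),
           sizeof(port));
    return port;
}

/*
 * Remember the original port, run the hook, then compare.
 * *port_changed is true only if the port value actually differs.
 * Returns whatever the hook returned.
 */
static int run_bind_hook(struct sockaddr *uaddr, bind_hook_fn hook,
                         bool *port_changed)
{
    assert(uaddr && hook && port_changed);

    in_port_t orig_port = sockaddr_port(uaddr);
    int ret = hook(uaddr);

    *port_changed = sockaddr_port(uaddr) != orig_port;
    return ret;
}

/* Example hook: rewrites the port to 80 (IPv4 only, for brevity). */
static int hook_rewrite_port_to_80(struct sockaddr *uaddr)
{
    ((struct sockaddr_in *)uaddr)->sin_port = htons(80);
    return 0;
}

/* Example hook: rewrites only the address to 2.2.2.2, port untouched. */
static int hook_rewrite_addr_only(struct sockaddr *uaddr)
{
    ((struct sockaddr_in *)uaddr)->sin_addr.s_addr = htonl(0x02020202);
    return 0;
}
```

In this sketch a hook that rewrites only the address (e.g.
1.1.1.1:50 -> 2.2.2.2:50) leaves port_changed false, so the bypass would
be tied purely to the port value actually changing.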
--
Andrey Ignatov