On Tue, Mar 10, 2020 at 06:47 PM CET, Lorenz Bauer wrote:
> Allow callers with CAP_NET_ADMIN to retrieve file descriptors from a
> sockmap and sockhash. O_CLOEXEC is enforced on all fds.
>
> Without this, it's difficult to resize or otherwise rebuild existing
> sockmap or sockhashes.
>
> Suggested-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx>
> Signed-off-by: Lorenz Bauer <lmb@xxxxxxxxxxxxxx>
> ---
>  net/core/sock_map.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 03e04426cd21..3228936aa31e 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -347,12 +347,31 @@ static void *sock_map_lookup(struct bpf_map *map, void *key)
>  static int __sock_map_copy_value(struct bpf_map *map, struct sock *sk,
>  				 void *value)
>  {
> +	struct file *file;
> +	int fd;
> +
>  	switch (map->value_size) {
>  	case sizeof(u64):
>  		sock_gen_cookie(sk);
>  		*(u64 *)value = atomic64_read(&sk->sk_cookie);
>  		return 0;
>
> +	case sizeof(u32):
> +		if (!capable(CAP_NET_ADMIN))
> +			return -EPERM;
> +
> +		fd = get_unused_fd_flags(O_CLOEXEC);
> +		if (unlikely(fd < 0))
> +			return fd;
> +
> +		read_lock_bh(&sk->sk_callback_lock);
> +		file = get_file(sk->sk_socket->file);

I think this deserves a second look.

We don't lock the sock, so what if tcp_close orphans it before we enter
this critical section? Looks like sk->sk_socket might be NULL.

I'd find a test that tries to trigger the race helpful, like:

  thread A: loop in lookup FD from map
  thread B: loop in insert FD into map, close FD

> +		read_unlock_bh(&sk->sk_callback_lock);
> +
> +		fd_install(fd, file);
> +		*(u32 *)value = fd;
> +		return 0;
> +
>  	default:
>  		return -ENOSPC;
>  	}
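
For illustration, the two-thread race test suggested in the reply above could look roughly like the user-space sketch below. It is not part of the patch or of the kernel selftests. The helper names (sys_bpf, lookup_loop, run_race) are hypothetical, and it assumes a sockmap with 4-byte values, a loopback TCP connection as the socket to insert, and the raw bpf(2) syscall rather than libbpf. On a kernel without the quoted patch the lookups are expected to fail, so the loop only exercises the insert/close side. run_race() returns 1 when the environment does not allow the test (no privileges, no sockmap support, no loopback).

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <assert.h>
#include <linux/bpf.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <unistd.h>

static int map_fd = -1;
static atomic_int stop;

/* Thin wrapper around the raw bpf(2) syscall (hypothetical helper). */
static long sys_bpf(int cmd, union bpf_attr *attr)
{
	return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/* Thread A: look up key 0 in a loop; a successful lookup yields a new fd. */
static void *lookup_loop(void *arg)
{
	uint32_t key = 0, value = 0;
	union bpf_attr attr;

	(void)arg;
	while (!atomic_load(&stop)) {
		memset(&attr, 0, sizeof(attr));
		attr.map_fd = map_fd;
		attr.key = (uint64_t)(uintptr_t)&key;
		attr.value = (uint64_t)(uintptr_t)&value;
		if (sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr) == 0)
			close((int)value);
	}
	return NULL;
}

/* Thread B (the caller): insert a connected socket at key 0, then close it. */
int run_race(int iterations)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	union bpf_attr attr;
	pthread_t tid;
	int listener, i;

	assert(iterations > 0);

	memset(&attr, 0, sizeof(attr));
	attr.map_type = BPF_MAP_TYPE_SOCKMAP;
	attr.key_size = sizeof(uint32_t);
	attr.value_size = sizeof(uint32_t);
	attr.max_entries = 1;
	map_fd = (int)sys_bpf(BPF_MAP_CREATE, &attr);
	if (map_fd < 0)
		return 1; /* skipped: bpf() unavailable or not permitted */

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	listener = socket(AF_INET, SOCK_STREAM, 0);
	if (listener < 0 ||
	    bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(listener, 16) < 0 ||
	    getsockname(listener, (struct sockaddr *)&addr, &len) < 0 ||
	    pthread_create(&tid, NULL, lookup_loop, NULL) != 0) {
		if (listener >= 0)
			close(listener);
		close(map_fd);
		return 1; /* skipped: no loopback TCP or no thread */
	}

	for (i = 0; i < iterations; i++) {
		uint32_t key = 0, value;
		int c = socket(AF_INET, SOCK_STREAM, 0);
		int s = -1;

		if (c >= 0 &&
		    connect(c, (struct sockaddr *)&addr, sizeof(addr)) == 0)
			s = accept(listener, NULL, NULL);
		if (s >= 0) {
			value = (uint32_t)c;
			memset(&attr, 0, sizeof(attr));
			attr.map_fd = map_fd;
			attr.key = (uint64_t)(uintptr_t)&key;
			attr.value = (uint64_t)(uintptr_t)&value;
			attr.flags = BPF_ANY;
			sys_bpf(BPF_MAP_UPDATE_ELEM, &attr); /* may fail; keep going */
			close(s);
		}
		if (c >= 0)
			close(c); /* closing races with thread A's lookup */
	}

	atomic_store(&stop, 1);
	pthread_join(tid, NULL);
	close(listener);
	close(map_fd);
	return 0;
}
```

Build with -pthread and call run_race() from a small main() with a large iteration count, as root. A crash or KASAN report in the kernel log during the run would point at the sk->sk_socket dereference; a clean run does not prove the race is absent.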