On Wed, Sep 7, 2022 at 9:19 AM Leonard Crestez <cdleonard@xxxxxxxxx> wrote:
>
> On 9/7/22 01:57, Eric Dumazet wrote:
> > On Mon, Sep 5, 2022 at 12:06 AM Leonard Crestez <cdleonard@xxxxxxxxx> wrote:
> >>
> >> This commit adds support to add and remove keys but does not use them
> >> further.
> >>
> >> Similar to tcp md5 a single pointer to a struct tcp_authopt_info* struct
> >> is added to struct tcp_sock, this avoids increasing memory usage. The
> >> data structures related to tcp_authopt are initialized on setsockopt and
> >> only freed on socket close.
> >>
> >
> > Thanks Leonard.
> >
> > Small points from my side, please find them attached.
>
> ...
>
> >> +/* Free info and keys.
> >> + * Don't touch tp->authopt_info, it might not even be assigned yes.
> >> + */
> >> +void tcp_authopt_free(struct sock *sk, struct tcp_authopt_info *info)
> >> +{
> >> +	kfree_rcu(info, rcu);
> >> +}
> >> +
> >> +/* Free everything and clear tcp_sock.authopt_info to NULL */
> >> +void tcp_authopt_clear(struct sock *sk)
> >> +{
> >> +	struct tcp_authopt_info *info;
> >> +
> >> +	info = rcu_dereference_protected(tcp_sk(sk)->authopt_info, lockdep_sock_is_held(sk));
> >> +	if (info) {
> >> +		tcp_authopt_free(sk, info);
> >> +		tcp_sk(sk)->authopt_info = NULL;
> >
> > RCU rules at deletion mandate that the pointer must be cleared before
> > the call_rcu()/kfree_rcu() call.
> >
> > It is possible that current MD5 code has an issue here, let's not copy/paste it.
>
> OK. Is there a need for some special form of assignment or is current
> plain form enough?

It is the right way (when clearing the pointer), no need for another form.

>
> >
> >> +	}
> >> +}
> >> +
> >> +/* checks that ipv4 or ipv6 addr matches. */
> >> +static bool ipvx_addr_match(struct sockaddr_storage *a1,
> >> +			    struct sockaddr_storage *a2)
> >> +{
> >> +	if (a1->ss_family != a2->ss_family)
> >> +		return false;
> >> +	if (a1->ss_family == AF_INET &&
> >> +	    (((struct sockaddr_in *)a1)->sin_addr.s_addr !=
> >> +	     ((struct sockaddr_in *)a2)->sin_addr.s_addr))
> >> +		return false;
> >> +	if (a1->ss_family == AF_INET6 &&
> >> +	    !ipv6_addr_equal(&((struct sockaddr_in6 *)a1)->sin6_addr,
> >> +			     &((struct sockaddr_in6 *)a2)->sin6_addr))
> >> +		return false;
> >> +	return true;
> >> +}
> >
> > Always surprising to see this kind of generic helper being added in a patch.
>
> I remember looking for an equivalent and not finding it. Many places
> have distinct code paths for ipv4 and ipv6 and my use of
> "sockaddr_storage" as ipv4/ipv6 union is uncommon.

inetpeer_addr_cmp() might do it (and we also could fix a bug in it, it
seems, at least for __tcp_get_metrics() usage) :/

>
> It also wastes some memory.
>
> >> +int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt)
> >> +{
> >> +	struct tcp_sock *tp = tcp_sk(sk);
> >> +	struct tcp_authopt_info *info;
> >> +
> >> +	memset(opt, 0, sizeof(*opt));
> >> +	sock_owned_by_me(sk);
> >> +
> >> +	info = rcu_dereference_check(tp->authopt_info, lockdep_sock_is_held(sk));
> >
> > Probably not a big deal, but it seems the prior sock_owned_by_me()
> > might be redundant.
>
> The sock_owned_by_me call checks lockdep_sock_is_held
>
> The rcu_dereference_check call checks lockdep_sock_is_held ||
> rcu_read_lock_held()

Then if you own the socket lock, no need for rcu_dereference_check()

It could be instead an rcu_dereference_protected(). This is stronger,
because if your thread no longer owns the socket lock, but is inside
rcu_read_lock(), we would still get a proper lockdep splat.

>
> This is a getsockopt so caller ensures socket locking but
> rcu_read_lock_held() == 0.
>
> The sock_owned_by_me is indeed redundant because it seems very unlikely
> the sockopt calling conditions will be changed. It was mostly there to
> clarify for myself because I probably had problems at one time with
> locking warnings. I guess they can be removed.
>
> >> +int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
> >> +{
> >> +	struct tcp_authopt_key opt;
> >> +	struct tcp_authopt_info *info;
> >> +	struct tcp_authopt_key_info *key_info, *old_key_info;
> >> +	struct netns_tcp_authopt *net = sock_net_tcp_authopt(sk);
> >> +	int err;
> >> +
> >> +	sock_owned_by_me(sk);
> >> +	if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
> >> +		return -EPERM;
> >> +
> >> +	err = _copy_from_sockptr_tolerant((u8 *)&opt, sizeof(opt), optval, optlen);
> >> +	if (err)
> >> +		return err;
> >> +
> >> +	if (opt.flags & ~TCP_AUTHOPT_KEY_KNOWN_FLAGS)
> >> +		return -EINVAL;
> >> +
> >> +	if (opt.keylen > TCP_AUTHOPT_MAXKEYLEN)
> >> +		return -EINVAL;
> >> +
> >> +	/* Delete is a special case: */
> >> +	if (opt.flags & TCP_AUTHOPT_KEY_DEL) {
> >> +		mutex_lock(&net->mutex);
> >> +		key_info = tcp_authopt_key_lookup_exact(sk, net, &opt);
> >> +		if (key_info) {
> >> +			tcp_authopt_key_del(net, key_info);
> >> +			err = 0;
> >> +		} else {
> >> +			err = -ENOENT;
> >> +		}
> >> +		mutex_unlock(&net->mutex);
> >> +		return err;
> >> +	}
> >> +
> >> +	/* check key family */
> >> +	if (opt.flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
> >> +		if (sk->sk_family != opt.addr.ss_family)
> >> +			return -EINVAL;
> >> +	}
> >> +
> >> +	/* Initialize tcp_authopt_info if not already set */
> >> +	info = __tcp_authopt_info_get_or_create(sk);
> >> +	if (IS_ERR(info))
> >> +		return PTR_ERR(info);
> >> +
> >> +	key_info = kmalloc(sizeof(*key_info), GFP_KERNEL | __GFP_ZERO);
> >
> > kzalloc() ?
>
> Yes
>
> >> +static int tcp_authopt_init_net(struct net *full_net)
> >
> > Hmmm... our convention is to use "struct net *net"
> >
> >> +{
> >> +	struct netns_tcp_authopt *net = &full_net->tcp_authopt;
> >
> > Here, you should use a different name ...
>
> OK, will replace with net_ao
>
> >> @@ -2267,10 +2268,11 @@ void tcp_v4_destroy_sock(struct sock *sk)
> >>  		tcp_clear_md5_list(sk);
> >>  		kfree_rcu(rcu_dereference_protected(tp->md5sig_info, 1), rcu);
> >>  		tp->md5sig_info = NULL;
> >>  	}
> >>  #endif
> >> +	tcp_authopt_clear(sk);
> >
> > Do we really own the socket lock at this point ?
>
> Not sure how I would tell but there is a lockdep_sock_is_held check
> inside tcp_authopt_clear. I also added sock_owned_by_me and there were
> no warnings.

Ok then :)
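
For anyone who wants to poke at the address comparison discussed above
outside the kernel, here is a minimal userspace sketch of the same idea.
It is not the code from the patch and it is not inetpeer_addr_cmp(): the
name addr_match_sketch() is made up for illustration, it works on the
userspace struct sockaddr_storage, and memcmp() stands in for
ipv6_addr_equal(). Like the quoted helper, it looks only at the family and
the address bytes, so ports are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical userspace analogue of the quoted ipvx_addr_match().
 * Returns true when both addresses have the same family and, for
 * AF_INET/AF_INET6, the same address bytes. Ports are not compared.
 */
static bool addr_match_sketch(const struct sockaddr_storage *a1,
			      const struct sockaddr_storage *a2)
{
	assert(a1 && a2);

	if (a1->ss_family != a2->ss_family)
		return false;

	if (a1->ss_family == AF_INET) {
		const struct sockaddr_in *s1 = (const struct sockaddr_in *)a1;
		const struct sockaddr_in *s2 = (const struct sockaddr_in *)a2;

		return s1->sin_addr.s_addr == s2->sin_addr.s_addr;
	}

	if (a1->ss_family == AF_INET6) {
		const struct sockaddr_in6 *s1 = (const struct sockaddr_in6 *)a1;
		const struct sockaddr_in6 *s2 = (const struct sockaddr_in6 *)a2;

		/* memcmp() is an assumption standing in for ipv6_addr_equal() */
		return memcmp(&s1->sin6_addr, &s2->sin6_addr,
			      sizeof(s1->sin6_addr)) == 0;
	}

	/* Same as the quoted helper: for any other family, equal
	 * families are treated as a match.
	 */
	return true;
}
```

Two AF_INET addresses that differ only in sin_port compare as equal with
this sketch, while an AF_INET and an AF_INET6 address never match.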