On Fri, Mar 5, 2021 at 5:21 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>
> Cong Wang wrote:
> > From: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> >
> > UDP already has udp_sendmsg() which takes lock_sock() inside.
> > We have to build ->sendmsg_locked() on top of it, by adding
> > a new parameter for whether the sock has been locked.
> >
> > Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > Cc: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx>
> > Cc: Lorenz Bauer <lmb@xxxxxxxxxxxxxx>
> > Signed-off-by: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> > ---
> >  include/net/udp.h  |  1 +
> >  net/ipv4/af_inet.c |  1 +
> >  net/ipv4/udp.c     | 30 +++++++++++++++++++++++-------
> >  3 files changed, 25 insertions(+), 7 deletions(-)
>
> [...]
>
> > -int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> > +static int __udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len, bool locked)
> > {
>
> The lock_sock is also taken by BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK() in
> udp_sendmsg(),
>
>         if (cgroup_bpf_enabled(BPF_CGROUP_UDP4_SENDMSG) && !connected) {
>                 err = BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk,
>                                             (struct sockaddr *)usin, &ipc.addr);
>
> so that will also need to be handled.

Indeed, good catch!

>
> It also looks like sk_dst_set() wants the sock lock to be held, but I'm not
> seeing how it's covered in the current code,
>
> static inline void
> __sk_dst_set(struct sock *sk, struct dst_entry *dst)
> {
>         struct dst_entry *old_dst;
>
>         sk_tx_queue_clear(sk);
>         sk->sk_dst_pending_confirm = 0;
>         old_dst = rcu_dereference_protected(sk->sk_dst_cache,
>                                             lockdep_sock_is_held(sk));
>         rcu_assign_pointer(sk->sk_dst_cache, dst);
>         dst_release(old_dst);
> }

I do not see how __sk_dst_set() is called in udp_sendmsg().

>
> I guess this could trip lockdep now, I'll dig a bit more Monday and see
> if it's actually the case.
>
> In general I don't really like code that wraps locks in 'if' branches
> like this. It seems fragile to me. I didn't walk every path in the code

I do not like it either. Actually I spent quite some time trying to get
rid of this lock_sock, and it is definitely not easy. The comment in
sk_psock_backlog() is clearly wrong: we do not take lock_sock to keep
sk_socket around, we take it to protect other structures like
ingress_{skb,msg}.

> to see if a lock is taken in any of the called functions but it looks
> like ip_send_skb() can call into netfilter code and may try to take
> the sock lock.

Are you saying skb_send_sock_locked() is buggy? If so, that is clearly
not my fault.

>
> Do we need this locked send at all? We use it in sk_psock_backlog
> but that routine needs an optimization rewrite for TCP anyways.
> Its dropping a lot of performance on the floor for no good reason.

At least we still need it for ingress_msg. It is not as easy as adding
a queue lock here, because we probably want to retrieve it atomically
together with the receive queue.

Thanks.
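
P.S. To make the shape of the 'locked' parameter concrete, here is a
rough sketch, not the actual diff: the udp_sendmsg_locked() wrapper
name and the elided body are placeholders for illustration only. The
idea is that every place where the existing udp_sendmsg() body takes
the sock lock becomes conditional on the new parameter, so both entry
points can share the same body:

static int __udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len,
			 bool locked)
{
	/*
	 * Sketch only: wherever the existing body takes the sock lock
	 * (the corked/pending path, and the cgroup BPF sendmsg hook
	 * John points out), the lock/unlock pair becomes conditional
	 * on 'locked' instead of being taken unconditionally.
	 */
	if (!locked)
		lock_sock(sk);

	/* ... existing locked section of udp_sendmsg() ... */

	if (!locked)
		release_sock(sk);

	/* ... rest of the existing body ... */
	return 0;
}

/* Regular ->sendmsg() path: the caller does not hold the sock lock. */
int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
{
	return __udp_sendmsg(sk, msg, len, false);
}

/* ->sendmsg_locked() path, e.g. sk_psock_backlog(), where the caller
 * already holds the sock lock.
 */
int udp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len)
{
	return __udp_sendmsg(sk, msg, len, true);
}

As you note, BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK() takes lock_sock()
internally as well, so that call would need the same kind of treatment
in the locked case.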