On Fri, Mar 5, 2021 at 5:21 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>
> Cong Wang wrote:
> > From: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> >
> > UDP already has udp_sendmsg() which takes lock_sock() inside.
> > We have to build ->sendmsg_locked() on top of it, by adding
> > a new parameter for whether the sock has been locked.
> >
> > Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > Cc: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx>
> > Cc: Lorenz Bauer <lmb@xxxxxxxxxxxxxx>
> > Signed-off-by: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> > ---
> >  include/net/udp.h  |  1 +
> >  net/ipv4/af_inet.c |  1 +
> >  net/ipv4/udp.c     | 30 +++++++++++++++++++++++-------
> >  3 files changed, 25 insertions(+), 7 deletions(-)
>
> [...]
>
> > -int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> > +static int __udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len, bool locked)
> > {
>
> The lock_sock is also taken by BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK() in
> udp_sendmsg(),
>
>         if (cgroup_bpf_enabled(BPF_CGROUP_UDP4_SENDMSG) && !connected) {
>                 err = BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk,
>                                             (struct sockaddr *)usin, &ipc.addr);
>
> so that will also need to be handled.

Indeed, good catch!

>
> It also looks like sk_dst_set() wants the sock lock to be held, but I'm not
> seeing how it's covered in the current code,
>
> static inline void
> __sk_dst_set(struct sock *sk, struct dst_entry *dst)
> {
>         struct dst_entry *old_dst;
>
>         sk_tx_queue_clear(sk);
>         sk->sk_dst_pending_confirm = 0;
>         old_dst = rcu_dereference_protected(sk->sk_dst_cache,
>                                             lockdep_sock_is_held(sk));
>         rcu_assign_pointer(sk->sk_dst_cache, dst);
>         dst_release(old_dst);
> }

I do not see how __sk_dst_set() is called in udp_sendmsg().

>
> I guess this could trip lockdep now, I'll dig a bit more Monday and see
> if it's actually the case.
>
> In general I don't really like code that wraps locks in 'if' branches
> like this. It seems fragile to me. I didn't walk every path in the code

I do not like it either. Actually I spent quite some time trying to get
rid of this lock_sock, and it is definitely not easy. The comment in
sk_psock_backlog() is clearly wrong: we do not take lock_sock to keep
sk_socket around, we take it to protect other structures like
ingress_{skb,msg}.

> to see if a lock is taken in any of the called functions but it looks
> like ip_send_skb() can call into netfilter code and may try to take
> the sock lock.

Are you saying skb_send_sock_locked() is buggy? If so, that is clearly
not my fault.

>
> Do we need this locked send at all? We use it in sk_psock_backlog
> but that routine needs an optimization rewrite for TCP anyways.
> Its dropping a lot of performance on the floor for no good reason.

At least we still need it for ingress_msg. It is not as easy as adding
a queue lock here, because we probably want to retrieve it atomically
together with the receive queue.

Thanks.
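
P.S. To make the shape of the 'locked' parameter concrete, here is a
rough sketch, not the actual diff: the udp_sendmsg_locked() wrapper
name and the elided body are placeholders for illustration only. The
idea is that every place where the existing udp_sendmsg() body takes
the sock lock becomes conditional on the new parameter, so both entry
points can share the same body:

static int __udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len,
			 bool locked)
{
	/*
	 * Sketch only: wherever the existing body takes the sock lock
	 * (the corked/pending path, and the cgroup BPF sendmsg hook
	 * John points out), the lock/unlock pair becomes conditional
	 * on 'locked' instead of being taken unconditionally.
	 */
	if (!locked)
		lock_sock(sk);

	/* ... existing locked section of udp_sendmsg() ... */

	if (!locked)
		release_sock(sk);

	/* ... rest of the existing body ... */
	return 0;
}

/* Regular ->sendmsg() path: the caller does not hold the sock lock. */
int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
{
	return __udp_sendmsg(sk, msg, len, false);
}

/* ->sendmsg_locked() path, e.g. sk_psock_backlog(), where the caller
 * already holds the sock lock.
 */
int udp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len)
{
	return __udp_sendmsg(sk, msg, len, true);
}

As you note, BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK() takes lock_sock()
internally as well, so that call would need the same kind of treatment
in the locked case.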