On Tue, May 30, 2023 at 8:43 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > John Fastabend wrote: > > Eric Dumazet wrote: > > > On Tue, May 23, 2023 at 4:56 AM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > > > > > > > When TCP stack has data ready to read sk_data_ready() is called. Sockmap > > > > overwrites this with its own handler to call into BPF verdict program. > > > > But, the original TCP socket had sock_def_readable that would additionally > > > > wake up any user space waiters with sk_wake_async(). > > > > > > > > Sockmap saved the callback when the socket was created so call the saved > > > > data ready callback and then we can wake up any epoll() logic waiting > > > > on the read. > > > > > > > > Note we call on 'copied >= 0' to account for returning 0 when a FIN is > > > > received because we need to wake up user for this as well so they > > > > can do the recvmsg() -> 0 and detect the shutdown. > > > > > > > > Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()") > > > > Reviewed-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> > > > > Signed-off-by: John Fastabend <john.fastabend@xxxxxxxxx> > > > > --- > > > > net/core/skmsg.c | 11 ++++++++++- > > > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > > > index bcd45a99a3db..08be5f409fb8 100644 > > > > --- a/net/core/skmsg.c > > > > +++ b/net/core/skmsg.c > > > > @@ -1199,12 +1199,21 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb) > > > > static void sk_psock_verdict_data_ready(struct sock *sk) > > > > { > > > > struct socket *sock = sk->sk_socket; > > > > + int copied; > > > > > > > > trace_sk_data_ready(sk); > > > > > > > > if (unlikely(!sock || !sock->ops || !sock->ops->read_skb)) > > > > return; > > > > - sock->ops->read_skb(sk, sk_psock_verdict_recv); > > > > + copied = sock->ops->read_skb(sk, sk_psock_verdict_recv); > > > > + if (copied >= 0) { > > > > + struct sk_psock *psock; > > > > + > > > > + rcu_read_lock(); > > > > + psock = sk_psock(sk); > > > > + psock->saved_data_ready(sk); > > > > + rcu_read_unlock(); > > > > + } > > > > } > > > > > > > > void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock) > > > > -- > > > > 2.33.0 > > > > > > > > > > It seems psock could be NULL here, right ? > > > > > > What do you think if I submit the following fix ? > > > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > > index a9060e1f0e4378fa47cfd375b4729b5b0a9f54ec..a29508e1ff3568583263b9307f7b1a0e814ba76d > > > 100644 > > > --- a/net/core/skmsg.c > > > +++ b/net/core/skmsg.c > > > @@ -1210,7 +1210,8 @@ static void sk_psock_verdict_data_ready(struct sock *sk) > > > > > > rcu_read_lock(); > > > psock = sk_psock(sk); > > > - psock->saved_data_ready(sk); > > > + if (psock) > > > + psock->saved_data_ready(sk); > > > rcu_read_unlock(); > > > } > > > } > > > > Yes please do presumably this is plausible if user delete map entry while > > data is being sent and we get a race. We don't have any tests for this > > in our CI though because we never delete socks after adding them and > > rely on the sock close. This shouldn't happen in that path because of the > > data_ready is blocked on SOCK_DEAD flag iirc. > > > > I'll think if we can add some stress test to add map update/delete in > > a tight loop with live socket sending/receiving traffic. > > > > Thanks > > I can also submit it if its easier just let me know. I will, this is based on a syzbot report I will also release, thanks !