John Fastabend wrote: > Eric Dumazet wrote: > > On Tue, May 23, 2023 at 4:56 AM John Fastabend <john.fastabend@xxxxxxxxx> wrote: > > > > > > When TCP stack has data ready to read sk_data_ready() is called. Sockmap > > > overwrites this with its own handler to call into BPF verdict program. > > > But, the original TCP socket had sock_def_readable that would additionally > > > wake up any user space waiters with sk_wake_async(). > > > > > > Sockmap saved the callback when the socket was created so call the saved > > > data ready callback and then we can wake up any epoll() logic waiting > > > on the read. > > > > > > Note we call on 'copied >= 0' to account for returning 0 when a FIN is > > > received because we need to wake up user for this as well so they > > > can do the recvmsg() -> 0 and detect the shutdown. > > > > > > Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()") > > > Reviewed-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> > > > Signed-off-by: John Fastabend <john.fastabend@xxxxxxxxx> > > > --- > > > net/core/skmsg.c | 11 ++++++++++- > > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > > index bcd45a99a3db..08be5f409fb8 100644 > > > --- a/net/core/skmsg.c > > > +++ b/net/core/skmsg.c > > > @@ -1199,12 +1199,21 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb) > > > static void sk_psock_verdict_data_ready(struct sock *sk) > > > { > > > struct socket *sock = sk->sk_socket; > > > + int copied; > > > > > > trace_sk_data_ready(sk); > > > > > > if (unlikely(!sock || !sock->ops || !sock->ops->read_skb)) > > > return; > > > - sock->ops->read_skb(sk, sk_psock_verdict_recv); > > > + copied = sock->ops->read_skb(sk, sk_psock_verdict_recv); > > > + if (copied >= 0) { > > > + struct sk_psock *psock; > > > + > > > + rcu_read_lock(); > > > + psock = sk_psock(sk); > > > + psock->saved_data_ready(sk); > > > + rcu_read_unlock(); > > > + } > > > } > > > > > > void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock) > > > -- > > > 2.33.0 > > > > > > > It seems psock could be NULL here, right ? > > > > What do you think if I submit the following fix ? > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > index a9060e1f0e4378fa47cfd375b4729b5b0a9f54ec..a29508e1ff3568583263b9307f7b1a0e814ba76d > > 100644 > > --- a/net/core/skmsg.c > > +++ b/net/core/skmsg.c > > @@ -1210,7 +1210,8 @@ static void sk_psock_verdict_data_ready(struct sock *sk) > > > > rcu_read_lock(); > > psock = sk_psock(sk); > > - psock->saved_data_ready(sk); > > + if (psock) > > + psock->saved_data_ready(sk); > > rcu_read_unlock(); > > } > > } > > Yes please do presumably this is plausible if user delete map entry while > data is being sent and we get a race. We don't have any tests for this > in our CI though because we never delete socks after adding them and > rely on the sock close. This shouldn't happen in that path because of the > data_ready is blocked on SOCK_DEAD flag iirc. > > I'll think if we can add some stress test to add map update/delete in > a tight loop with live socket sending/receiving traffic. > > Thanks I can also submit it if its easier just let me know.