Daniel Borkmann wrote: > On Wed, Jan 08, 2020 at 12:57:08PM +0800, Lingpeng Chen wrote: > > Right now in tcp_bpf_recvmsg, sock read data first from sk_receive_queue > > if not empty than psock->ingress_msg otherwise. If a FIN packet arrives > > and there's also some data in psock->ingress_msg, the data in > > psock->ingress_msg will be purged. It is always happen when request to a > > HTTP1.0 server like python SimpleHTTPServer since the server send FIN > > packet after data is sent out. > > > > Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") > > Reported-by: Arika Chen <eaglesora@xxxxxxxxx> > > Suggested-by: Arika Chen <eaglesora@xxxxxxxxx> > > Signed-off-by: Lingpeng Chen <forrest0579@xxxxxxxxx> > > Signed-off-by: John Fastabend <john.fastabend@xxxxxxxxx> > > --- > > net/ipv4/tcp_bpf.c | 7 ++++--- > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c > > index e38705165ac9..f7e902868fce 100644 > > --- a/net/ipv4/tcp_bpf.c > > +++ b/net/ipv4/tcp_bpf.c > > @@ -123,12 +123,13 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, > > > > if (unlikely(flags & MSG_ERRQUEUE)) > > return inet_recv_error(sk, msg, len, addr_len); > > Shouldn't we also move the error queue handling below the psock test as > well and let tcp_recvmsg() natively do it in case of !psock? > You mean the MSG_ERRQUEUE flag handling? If the user sets MSG_ERRQUEUE they expect to receive any queued errors it would be wrong to return psock data in this case if psock is attached and has data on queue and user passes MSG_ERRQUEUE flag. MSG_ERRQUEUE (since Linux 2.2) This flag specifies that queued errors should be received from the socket error queue. The error is passed in an ancillary message with a type dependent on the protocol (for IPv4 IP_RECVERR). The user should supply a buffer of sufficient size. See cmsg(3) and ip(7) for more information. The payload of the original packet that caused the error is passed as normal data via msg_iovec. The original destination address of the datagram that caused the error is supplied via msg_name. I believe it needs to be where it is. > > - if (!skb_queue_empty(&sk->sk_receive_queue)) > > - return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > > > > psock = sk_psock_get(sk); > > if (unlikely(!psock)) > > return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > > + if (!skb_queue_empty(&sk->sk_receive_queue) && > > + sk_psock_queue_empty(psock)) > > + return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > > lock_sock(sk); > > msg_bytes_ready: > > copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags); > > @@ -139,7 +140,7 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, > > timeo = sock_rcvtimeo(sk, nonblock); > > data = tcp_bpf_wait_data(sk, psock, flags, timeo, &err); > > if (data) { > > - if (skb_queue_empty(&sk->sk_receive_queue)) > > + if (!sk_psock_queue_empty(psock)) > > goto msg_bytes_ready; > > release_sock(sk); > > sk_psock_put(sk, psock); > > -- > > 2.17.1 > >