Daniel Borkmann wrote: > On 1/8/20 7:01 PM, John Fastabend wrote: > > Daniel Borkmann wrote: > >> On Wed, Jan 08, 2020 at 12:57:08PM +0800, Lingpeng Chen wrote: > >>> Right now in tcp_bpf_recvmsg, sock read data first from sk_receive_queue > >>> if not empty than psock->ingress_msg otherwise. If a FIN packet arrives > >>> and there's also some data in psock->ingress_msg, the data in > >>> psock->ingress_msg will be purged. It is always happen when request to a > >>> HTTP1.0 server like python SimpleHTTPServer since the server send FIN > >>> packet after data is sent out. > >>> > >>> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") > >>> Reported-by: Arika Chen <eaglesora@xxxxxxxxx> > >>> Suggested-by: Arika Chen <eaglesora@xxxxxxxxx> > >>> Signed-off-by: Lingpeng Chen <forrest0579@xxxxxxxxx> > >>> Signed-off-by: John Fastabend <john.fastabend@xxxxxxxxx> > >>> --- > >>> net/ipv4/tcp_bpf.c | 7 ++++--- > >>> 1 file changed, 4 insertions(+), 3 deletions(-) > >>> > >>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c > >>> index e38705165ac9..f7e902868fce 100644 > >>> --- a/net/ipv4/tcp_bpf.c > >>> +++ b/net/ipv4/tcp_bpf.c > >>> @@ -123,12 +123,13 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, > >>> > >>> if (unlikely(flags & MSG_ERRQUEUE)) > >>> return inet_recv_error(sk, msg, len, addr_len); > >> > >> Shouldn't we also move the error queue handling below the psock test as > >> well and let tcp_recvmsg() natively do it in case of !psock? > >> > > > > You mean the MSG_ERRQUEUE flag handling? If the user sets MSG_ERRQUEUE > > they expect to receive any queued errors it would be wrong to return > > psock data in this case if psock is attached and has data on queue and > > user passes MSG_ERRQUEUE flag. > > > > MSG_ERRQUEUE (since Linux 2.2) > > This flag specifies that queued errors should be received from the socket > > error queue. The error is passed in an ancillary message with a type > > dependent on the protocol (for IPv4 IP_RECVERR). The user should supply > > a buffer of sufficient size. See cmsg(3) and ip(7) for more information. > > The payload of the original packet that caused the error is passed as > > normal data via msg_iovec. The original destination address of the > > datagram that caused the error is supplied via msg_name. > > > > I believe it needs to be where it is. > > I meant that it should have looked as follows (aka moving both below the > psock test) ... > > psock = sk_psock_get(sk); > if (unlikely(!psock)) > return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); > if (unlikely(flags & MSG_ERRQUEUE)) > return inet_recv_error(sk, msg, len, addr_len); > if (!skb_queue_empty(&sk->sk_receive_queue) && [...] > > ... since when detached it's handled already via tcp_recvmsg() internals. > Ah, OK the 'if (unlikely(!psock))' is a very rare case so I'm not sure it matters much. But, I agree it is slightly nicer. Lingpeng, can you spin a v2 moving the MSG_ERRQUEUE check down and also keep the ACK and stable tags I don't think its a very big change the Acks can stick around IMO. Thanks.