On Wed, May 19, 2021 at 2:56 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>
> Cong Wang wrote:
> > From: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> >
> > We use non-blocking sockets for testing sockmap redirections,
> > and got some random EAGAIN errors from UDP tests.
> >
> > There is no guarantee the packet would be immediately available
> > to receive as soon as it is sent out, even on the local host.
> > For UDP, this is especially true because it does not lock the
> > sock during BH (unlike the TCP path). This is probably why we
> > only saw this error in UDP cases.
> >
> > No matter how hard we try to make the queue empty check accurate,
> > it is always possible for recvmsg() to beat ->sk_data_ready().
> > Therefore, we should just retry in case of EAGAIN.
> >
> > Fixes: d6378af615275 ("selftests/bpf: Add a test case for udp sockmap")
> > Reported-by: Jiang Wang <jiang.wang@xxxxxxxxxxxxx>
> > Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > Cc: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx>
> > Cc: Lorenz Bauer <lmb@xxxxxxxxxxxxxx>
> > Signed-off-by: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> > ---
> >  tools/testing/selftests/bpf/prog_tests/sockmap_listen.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> > index 648d9ae898d2..b1ed182c4720 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> > @@ -1686,9 +1686,13 @@ static void udp_redir_to_connected(int family, int sotype, int sock_mapfd,
> >  	if (pass != 1)
> >  		FAIL("%s: want pass count 1, have %d", log_prefix, pass);
> >
> > +again:
> >  	n = read(mode == REDIR_INGRESS ? p0 : c0, &b, 1);
> > -	if (n < 0)
> > +	if (n < 0) {
> > +		if (errno == EAGAIN)
> > +			goto again;
> >  		FAIL_ERRNO("%s: read", log_prefix);
>
> Needs a counter and abort logic we don't want to loop forever in the
> case the packet is lost.

It should not be lost, because selftests must be self-contained. If a
selftest could not even predict whether its own packet is lost, we
would have much bigger trouble than this infinite loop.

Thanks.