On 8/31/2023 8:09 PM, Daniel Borkmann wrote:
On 8/31/23 3:31 AM, Xu Kuohai wrote:
From: Xu Kuohai <xukuohai@xxxxxxxxxx>
While commit 90f0074cd9f9 ("selftests/bpf: fix a CI failure caused by vsock sockmap test")
fixes a receive failure of vsock sockmap test, there is still a write failure:
Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir
Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir
./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected
vsock_unix_redir_connectible:FAIL:1501
./test_progs:vsock_unix_redir_connectible:1501: ingress: write: Transport endpoint is not connected
vsock_unix_redir_connectible:FAIL:1501
./test_progs:vsock_unix_redir_connectible:1501: egress: write: Transport endpoint is not connected
vsock_unix_redir_connectible:FAIL:1501
The reason is that the vsock connection in the test is moved to the ESTABLISHED
state by virtio_transport_recv_pkt(), which runs in a workqueue thread. If the
user space test thread writes before the workqueue thread has run, the write
fails with ENOTCONN.
To fix it, wait for the connection to be established before writing to it.
Fixes: d61bd8c1fd02 ("selftests/bpf: add a test case for vsock sockmap")
Signed-off-by: Xu Kuohai <xukuohai@xxxxxxxxxx>
Thanks for the fix! Looks like this is gone now at least in the tests which succeed,
but there are still two issues:
1) s390x fails in BPF CI as below:
https://github.com/kernel-patches/bpf/actions/runs/6031993528/job/16366784236
Error: #211 sockmap_listen
Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir
Error: #211/79 sockmap_listen/sockmap VSOCK test_vsock_redir
./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument
vsock_socketpair_connectible:FAIL:1456
./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1494
./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument
vsock_socketpair_connectible:FAIL:1456
./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1494
./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument
vsock_socketpair_connectible:FAIL:1456
./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1494
./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument
vsock_socketpair_connectible:FAIL:1456
./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1494
Error: #211/158 sockmap_listen/sockhash VSOCK test_vsock_redir
Error: #211/158 sockmap_listen/sockhash VSOCK test_vsock_redir
./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument
vsock_socketpair_connectible:FAIL:1456
./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1494
./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument
vsock_socketpair_connectible:FAIL:1456
./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1494
./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument
vsock_socketpair_connectible:FAIL:1456
./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1494
./test_progs:vsock_socketpair_connectible:1456: poll_connect: Invalid argument
vsock_socketpair_connectible:FAIL:1456
./test_progs:vsock_unix_redir_connectible:1494: vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1494
Oops, I think it's because the esize variable is not initialized,
causing getsockopt to read a garbage value.
2) Various panics: some GPFs, but NULL pointer derefs have also been seen. Discussed in the other
thread: https://lore.kernel.org/bpf/ZO+RQwJhPhYcNGAi@krava/
still debugging ...
I believe issue 1) might still be related to your fix in here, ptal.
Sorry for introducing issue 1), will post a fix soon.
Thanks,
Daniel