Thanks Stefano,

The feedback about vsock expectations was exactly what I was hoping you
could provide.

In the Kata agent we're not directly setting SO_REUSEPORT as a socket
option, so I think what you suggest, where SO_REUSEPORT is being set
indiscriminately, is happening a layer down, perhaps in the tokio or nix
crates we use.

Unfortunately I do not have an easy way to reproduce the problem without
setting up kata containers, and what's more you then need to rebuild a
recent kata-flavoured minimal kernel to see the issue. I spent the day
updating our build to use the latest kata containers release and
dependencies to see if that would correct the issue. It did not, so I
will work tomorrow on getting stack traces etc. to figure things out
more directly.

For the others on the thread: based on what Stefano said, although
throwing an error for vsocks is a change in behaviour, I suspect this is
a problem we can fix in a crate, corrected to be more aware of vsock
capabilities. I'll know better what's possible and will update tomorrow.

Thanks
-Simon

On Tue, Jan 21, 2025 at 4:54 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
>
> On Tue, 21 Jan 2025 at 10:26, Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
> >
> > Hi Simon,
> >
> > On Tue, 21 Jan 2025 at 05:53, Simon Kaegi <simon.kaegi@xxxxxxxxx> wrote:
> > >
> > > #regzbot introduced v6.6.69..v6.6.70
> > > #regzbot introduced: ad91a2dacbf8c26a446658cdd55e8324dfeff1e7
> > >
> > > We hit this regression when updating our guest vm kernel from 6.6.69
> > > to 6.6.70 -- bisecting, this problem was introduced in
> > > ad91a2dacbf8c26a446658cdd55e8324dfeff1e7 -- net: restrict SO_REUSEPORT
> > > to inet sockets
> > >
> > > We're getting a timeout when trying to connect to the vsocket in the
> > > guest VM when launching a kata containers 3.10.1 agent which
> > > unsurprisingly ... uses a vsocket to communicate back to the host.
> > >
> > > We updated this commit and added an additional sk_is_vsock check and
> > > recompiled and this works correctly for us.
> > > - if (valbool && !sk_is_inet(sk))
> > > + if (valbool && !(sk_is_inet(sk) || sk_is_vsock(sk)))
> > >
> > > My understanding is limited here so I've added Stefano as he is likely
> > > to better understand what makes sense here.
> >
> > Thanks for adding me, do you have a reproducer here?
> >
> > AFAIK in AF_VSOCK we never supported SO_REUSEPORT, so it seems strange to me.
> >
> > I understand that the patch you refer to actually changes the behavior
> > of setsockopt(..., SO_REUSEPORT, ...) on an AF_VSOCK socket, where it
> > used to return successfully before that change, but now returns an
> > error, but subsequent binds should have still failed even without this
> > patch.
> >
> > Do you actually use the SO_REUSEPORT feature on AF_VSOCK?
> >
> > If so, I need to better understand if the core socket does anything,
> > but as I recall AF_VSOCK allocates ports internally, so I don't think
> > multiple binds on the same port have ever been supported.
>
> I just tried on an old kernel without the patch applied, and I confirm
> that SO_REUSEPORT was not supported also if the setsockopt() was
> successful.
>
> I run the following snippet on 2 shell, on the first one everything
> fine, but on the second the bind() fails in this way:
>
> $ uname -r
> 6.10.11-200.fc40.x86_64
> $ python3
> >>> import socket
> >>> import os
> >>> s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
> >>> s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
> >>> s.bind((socket.VMADDR_CID_ANY, 4242))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OSError: [Errno 98] Address already in use
>
>
> With the patch applied, the setsockopt() fails immediately, but the
> bind() behavior is the same (fails only on the second):
>
> $ uname -r
> 6.12.9-200.fc41.x86_64
> $ python3
> >>> import socket
> >>> import os
> >>> s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
> >>> s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
> Traceback (most recent call last):
>   File "<python-input-3>", line 1, in <module>
>     s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
>     ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> OSError: [Errno 95] Operation not supported
>
> So, IMHO the patch is correct since AF_VSOCK never really supported
> SO_REUSEPORT, so better to fail early.
>
> BTW I'm not sure what is happening on your side.
> Could it be a problem in your code that uses SO_REUSEPORT
> indiscriminately on AF_VSOCK, even though you then never bind on the
> same port again?
>
> Thanks,
> Stefano
>
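
For illustration only, here is a minimal Python sketch of the kind of
userspace guard the thread points toward: request SO_REUSEPORT only on
inet sockets and treat "not supported" as non-fatal elsewhere. The
helper name set_reuseport_if_supported is hypothetical, and this is not
what the Kata agent, tokio or nix actually do; it only assumes the
kernel behaviour described above (EOPNOTSUPP on non-inet sockets once
the patch is applied).

```python
import errno
import socket

# Address families for which SO_REUSEPORT is still accepted after the
# "net: restrict SO_REUSEPORT to inet sockets" change.
_INET_FAMILIES = (socket.AF_INET, socket.AF_INET6)


def set_reuseport_if_supported(sock):
    """Hypothetical helper: enable SO_REUSEPORT only where it applies.

    Returns True if the option was set, False if it was skipped or the
    kernel reported that it is not supported for this socket.
    """
    opt = getattr(socket, "SO_REUSEPORT", None)  # missing on some platforms
    if opt is None or sock.family not in _INET_FAMILIES:
        # AF_VSOCK, AF_UNIX, ...: do not ask for SO_REUSEPORT at all.
        return False
    try:
        sock.setsockopt(socket.SOL_SOCKET, opt, 1)
    except OSError as exc:
        # Treat "not supported" as non-fatal; re-raise anything else.
        if exc.errno in (errno.EOPNOTSUPP, errno.ENOPROTOOPT):
            return False
        raise
    return True
```

A caller would create the socket as usual, call
set_reuseport_if_supported(s), and then bind() regardless of the return
value, so an AF_VSOCK listener never depends on the option being
accepted.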