Thanks so much Stefano and to Moritz Sanft
https://github.com/containerd/ttrpc-rust/pull/280

We've rebuilt and everything is again working as expected - all resolved.

Thanks again everyone
-Simon

On Wed, Jan 22, 2025 at 6:12 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
>
> On Wed, 22 Jan 2025 at 10:23, Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
> >
> > CCing Ruoqing He
> >
> > On Wed, 22 Jan 2025 at 04:48, Simon Kaegi <simon.kaegi@xxxxxxxxx> wrote:
> > >
> > > Thanks Stefano,
> > >
> > > The feedback about vsock expectations was exactly what I was hoping
> > > you could provide.
> >
> > You're welcome ;-)
> >
> > >
> > > In the Kata agent we're not directly setting SO_REUSEPORT as a socket
> > > option so I think what you suggest where SO_REUSEPORT is being set
> > > indiscriminately is happening a layer down perhaps in the tokio or nix
> > > crates we use. I unfortunately do not have an easy way to reproduce
> > > the problem without setting up kata containers and what's more you
> > > need to then rebuild a recent kata flavoured minimal kernel to see the
> > > issue.
> >
> > I talked with Ruoqing He yesterday about this issue since he knows
> > Kata better than me :-)
> >
> > He pointed out that Kata is using ttrpc-rust and he shared with me this code:
> > https://github.com/containerd/ttrpc-rust/blob/0610015a92c340c6d88f81c0d6f9f449dfd0ecba/src/common.rs#L175
> >
> > The change (setting SO_REUSEPORT) was introduced more than 4 years
> > ago, but I honestly don't think it solved the problem mentioned in the
> > commit:
> > https://github.com/containerd/ttrpc-rust/commit/9ac87828ee870ecf5fb5feaa45cc0c9e3d34e236
> > So far it didn't give any problems because it was allowed on every
> > socket, but effectively it was a NOP for AF_VSOCK.
> >
> > IIUC that code, it supports 2 address families: AF_VSOCK and AF_UNIX.
> > For AF_VSOCK we've made it clear that SO_REUSEPORT is useless, but for
> > AF_UNIX it's even more useless since there's no concept of a port, so
> > in my opinion `setsockopt(fd, sockopt::ReusePort, &true)?;` can be
> > removed completely.
> > Or at least not fail the entire function if it's unsupported, whereas
> > now it fails and the next bind is not done.
> >
> > I don't know where this code is called, but removing that line is
> > likely to make everything work correctly.
>
> It looks like they already released a new version of ttrpc to fix it:
> https://github.com/containerd/ttrpc-rust/pull/281
>
> And Kata is updating its dependency:
> https://github.com/kata-containers/kata-containers/pull/10775
>
> I hope it will fix your issue!
>
> Stefano
>
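
For anyone reading this thread later: Stefano's alternative suggestion above (do not fail the whole function when SO_REUSEPORT is unsupported) could be sketched roughly as below. This is only an illustration, not the change that went into ttrpc-rust. The helper name `ignore_unsupported` is hypothetical, and the errno values are the Linux ones, hard-coded as an assumption so the sketch needs only the standard library.

```rust
use std::io;

// Linux errno values (assumption: a Linux target; real code would take these
// from the libc or nix crates instead of hard-coding them).
const ENOPROTOOPT: i32 = 92;
const EOPNOTSUPP: i32 = 95;

/// Hypothetical helper: treat "socket option not supported" as success so the
/// caller can carry on to bind(), and pass every other error through unchanged.
fn ignore_unsupported(res: io::Result<()>) -> io::Result<()> {
    match res {
        Err(e) if matches!(e.raw_os_error(), Some(ENOPROTOOPT) | Some(EOPNOTSUPP)) => Ok(()),
        other => other,
    }
}
```

The caller would wrap the result of the setsockopt call in this helper before applying `?`, so an unsupported option no longer skips the following bind, while unrelated errors are still reported.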