Re: [REGRESSION] vsocket timeout with kata containers agent 3.10.1 and kernel 6.6.70

On Wed, 22 Jan 2025 at 10:23, Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
>
> CCing Ruoqing He
>
> On Wed, 22 Jan 2025 at 04:48, Simon Kaegi <simon.kaegi@xxxxxxxxx> wrote:
> >
> > Thanks Stefano,
> >
> > The feedback about vsock expectations was exactly what I was hoping
> > you could provide.
>
> You're welcome ;-)
>
> >
> > In the Kata agent we're not directly setting SO_REUSEPORT as a socket
> > option so I think what you suggest where SO_REUSEPORT is being set
> > indiscriminately is happening a layer down perhaps in the tokio or nix
> > crates we use. I unfortunately do not have an easy way to reproduce
> > the problem without setting up kata containers and what's more you
> > need to then rebuild a recent kata flavoured minimal kernel to see the
> > issue.
>
> I talked with Ruoqing He yesterday about this issue since he knows
> Kata better than me :-)
>
> He pointed out that Kata is using ttrpc-rust and he shared with me this code:
> https://github.com/containerd/ttrpc-rust/blob/0610015a92c340c6d88f81c0d6f9f449dfd0ecba/src/common.rs#L175
>
> The change (setting SO_REUSEPORT) was introduced more than 4 years
> ago, but I honestly don't think it solved the problem mentioned in the
> commit:
> https://github.com/containerd/ttrpc-rust/commit/9ac87828ee870ecf5fb5feaa45cc0c9e3d34e236
> So far it didn't give any problems because it was allowed on every
> socket, but effectively it was a NOP for AF_VSOCK.
>
> IIUC that code, it supports 2 address families: AF_VSOCK and AF_UNIX.
> For AF_VSOCK we've made it clear that SO_REUSEPORT is useless, but for
> AF_UNIX it's even more useless since there's no concept of a port, so
> in my opinion `setsockopt(fd, sockopt::ReusePort, &true)?;` can be
> removed completely.
> Or at least not fail the entire function if it's unsupported, whereas
> now it fails and the next bind is not done.
>
> I don't know where this code is called, but removing that line is
> likely to make everything work correctly.
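
To illustrate the second option quoted above (tolerating an unsupported
SO_REUSEPORT instead of failing before the bind), here is a minimal
std-only Rust sketch. It is not the ttrpc-rust code: `set_reuseport` and
`socket_best_effort` are hypothetical names, and the numeric constants are
hardcoded Linux values (an assumption; real code would take them from the
libc or nix crates).

```rust
use std::io;
use std::os::raw::{c_int, c_void};
use std::os::unix::io::{AsRawFd, FromRawFd, OwnedFd};

// Assumption: Linux values as found on x86_64/aarch64. Hardcoded only to
// keep this sketch free of external crates.
pub const AF_UNIX: c_int = 1;
const SOCK_STREAM: c_int = 1;
const SOL_SOCKET: c_int = 1;
const SO_REUSEPORT: c_int = 15;

unsafe extern "C" {
    fn socket(domain: c_int, ty: c_int, protocol: c_int) -> c_int;
    fn setsockopt(
        fd: c_int,
        level: c_int,
        name: c_int,
        val: *const c_void,
        len: u32,
    ) -> c_int;
}

/// Hypothetical helper: ask the kernel to set SO_REUSEPORT on `sock`.
fn set_reuseport(sock: &OwnedFd) -> io::Result<()> {
    let one: c_int = 1;
    let rc = unsafe {
        setsockopt(
            sock.as_raw_fd(),
            SOL_SOCKET,
            SO_REUSEPORT,
            &one as *const c_int as *const c_void,
            std::mem::size_of::<c_int>() as u32,
        )
    };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

/// Hypothetical helper: create a stream socket and treat SO_REUSEPORT as
/// best effort, so a kernel that rejects the option does not prevent the
/// caller from going on to bind().
pub fn socket_best_effort(domain: c_int) -> io::Result<OwnedFd> {
    let fd = unsafe { socket(domain, SOCK_STREAM, 0) };
    if fd < 0 {
        return Err(io::Error::last_os_error());
    }
    // SAFETY: `fd` was just returned by socket() and has no other owner.
    let sock = unsafe { OwnedFd::from_raw_fd(fd) };

    // Log and continue instead of propagating the error with `?`.
    if let Err(err) = set_reuseport(&sock) {
        eprintln!("SO_REUSEPORT not applied, continuing: {err}");
    }
    Ok(sock)
}
```

Calling `socket_best_effort(AF_UNIX)` returns a usable socket whether or
not the kernel accepts the option. Dropping the `setsockopt` call
altogether, as suggested above, is the simpler alternative.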

It looks like they already released a new version of ttrpc to fix it:
https://github.com/containerd/ttrpc-rust/pull/281

And Kata is updating its dependency:
https://github.com/kata-containers/kata-containers/pull/10775

I hope it will fix your issue!

Stefano




