Re: [REGRESSION] vsocket timeout with kata containers agent 3.2.0 and kernel 6.1.63

Thanks Greg, Stefano,

TL;DR: withdrawing the regression report. The bug is in rust-vmm's vsock device, not in the kernel.
--

We're not strictly tied to the 6.1.x tree but generally stick with the
long term releases because we patch every week or so and want less to
change if possible.

I think you're exactly right re: rust-vmm's vsock. We're using Cloud
Hypervisor and just tried updating to a version that includes the fix,
and everything is working as expected.
https://github.com/rust-vmm/vm-virtio/issues/204 (thanks Stefano)

Thanks all... nothing to see.
- Simon

On Mon, Dec 11, 2023 at 3:39 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
>
> On Mon, Dec 11, 2023 at 5:05 AM Simon Kaegi <simon.kaegi@xxxxxxxxx> wrote:
> >
> > #regzbot introduced v6.1.62..v6.1.63
> > #regzbot introduced: baddcc2c71572968cdaeee1c4ab3dc0ad90fa765
> >
> > We hit this regression when updating our guest vm kernel from 6.1.62 to
> > 6.1.63 -- bisecting, this problem was introduced
> > in baddcc2c71572968cdaeee1c4ab3dc0ad90fa765 -- virtio/vsock: replace
> > virtio_vsock_pkt with sk_buff --
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.1.63&id=baddcc2c71572968cdaeee1c4ab3dc0ad90fa765
> >
> > We're getting a timeout when trying to connect to the vsocket in the
> > guest VM when launching a kata containers 3.2.0 agent. We haven't done
> > much more to understand the problem at this point.
>
> It looks like the same issue described here:
> https://github.com/rust-vmm/vm-virtio/issues/204
>
> In summary, that patch also contains a performance improvement, because
> by switching to sk_buffs, we can use only one descriptor for the whole
> packet (header + payload), whereas before we used two for each packet.
> Some devices (e.g. rust-vmm's vsock) mistakenly always expect 2
> descriptors, but this is a violation of the VIRTIO specification.
>
> Which device are you using?
>
> Can you confirm that your device conforms to the specification?
>
> Stefano
>
> >
> > We can reproduce 100% of the time but don't currently have a simple
> > reproducer as the problem was found in our build service which uses
> > kata-containers (with cloud-hypervisor).
> >
> > We have not checked the mainline as we currently are tied to 6.1.x.
> >
> > -Simon
> >
>
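
For readers following the descriptor-layout point in Stefano's message, here is a minimal sketch of a device-side parser that does not care how many descriptors carry a packet. It is an illustration only, not the rust-vmm or kernel code: the names `HDR` and `parse_packet` are hypothetical, and a descriptor chain is modelled simply as a list of Python `bytes` objects. The 44-byte little-endian header layout follows `struct virtio_vsock_hdr` from the VIRTIO specification.

```python
import struct

# struct virtio_vsock_hdr (packed, little-endian, 44 bytes):
# src_cid, dst_cid (u64); src_port, dst_port, len (u32);
# type, op (u16); flags, buf_alloc, fwd_cnt (u32).
HDR = struct.Struct("<QQIIIHHIII")


def parse_packet(descriptors):
    """Return (header_fields, payload) from a descriptor chain.

    `descriptors` is a hypothetical stand-in for a virtqueue chain:
    a list of bytes objects in chain order. The parser treats the chain
    as one byte stream, so header and payload may share a descriptor
    or be split across several.
    """
    data = b"".join(descriptors)
    if len(data) < HDR.size:
        raise ValueError("descriptor chain too short for vsock header")

    fields = HDR.unpack_from(data)
    payload_len = fields[4]  # the header's `len` field
    payload = data[HDR.size:HDR.size + payload_len]
    if len(payload) != payload_len:
        raise ValueError("descriptor chain shorter than header len field")
    return fields, payload


# Same packet, two layouts: header + payload in one descriptor
# (as after the sk_buff change) and in two (as before it).
hdr = HDR.pack(3, 2, 1024, 1234, 5, 1, 5, 0, 65536, 0)
assert parse_packet([hdr + b"hello"]) == parse_packet([hdr, b"hello"])
```

A device that instead indexes the chain as "descriptor 0 is the header, descriptor 1 is the payload" would fail on the single-descriptor layout, which matches the failure mode described above.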




