Re: How to implement message forwarding from one CID to another in vhost driver

Hey Stefano,
Apart from my questions in my previous email, I have some others as well.

If the vhost-device-vsock modification to forward packets to
VMADDR_CID_LOCAL is implemented, does VMADDR_FLAG_TO_HOST need to be
set by any application in the guest? I understand that the flag is set
automatically in the listen path by the driver (ref:
https://patchwork.ozlabs.org/project/netdev/patch/20201204170235.84387-4-andraprs@xxxxxxxxxx/#2594117
), but from the comments in the referenced patch, I am guessing that
the applications in the guest that will "connect" (as opposed to
listen) need to set the flag themselves in the application code. Is
that right, or should it work without the flag? I am asking because
the nitro-enclave VMs have an "init" which, on boot, connects to CID 3
and sends a "hello" (expecting a "hello" reply) to let the parent VM
know that it booted, but that init does not seem to set the flag:
https://github.com/aws/aws-nitro-enclaves-sdk-bootstrap/blob/main/init/init.c#L356C1-L361C7
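
To make sure I understand what "setting the flag in the application
code" would mean, below is a rough sketch of what I imagine such a
"connect" application in the guest would have to look like if the flag
has to be set explicitly (the port 9000 is only for illustration):

/* Sketch only: a guest application connecting to CID 3 with
 * VMADDR_FLAG_TO_HOST set explicitly (port 9000 is just an example).
 * Needs kernel headers that define svm_flags / VMADDR_FLAG_TO_HOST. */
#include <string.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

int connect_to_parent(void)
{
    int fd = socket(AF_VSOCK, SOCK_STREAM, 0);
    struct sockaddr_vm addr;

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.svm_family = AF_VSOCK;
    addr.svm_cid = 3;                      /* the "parent" CID */
    addr.svm_port = 9000;                  /* example port */
    addr.svm_flags = VMADDR_FLAG_TO_HOST;  /* the flag in question */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    return fd;
}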

I was following
https://github.com/rust-vmm/vhost-device/tree/main/vhost-device-vsock#sibling-vm-communication
to test whether sibling communication works, and it seems I did not
need to modify socat to set VMADDR_FLAG_TO_HOST. I am wondering why it
works without any modification. Here is what I do:

shell1: ./vhost-device-vsock \
    --vm guest-cid=3,uds-path=/tmp/vm3.vsock,socket=/tmp/vhost3.socket \
    --vm guest-cid=4,uds-path=/tmp/vm4.vsock,socket=/tmp/vhost4.socket

shell2: ./qemu-system-x86_64 -machine q35,memory-backend=mem0 \
    -enable-kvm -m 8G -nic user,model=virtio \
    -drive file=/home/dorjoy/Forks/test_vm/fedora2.qcow2,media=disk,if=virtio \
    --display sdl -object memory-backend-memfd,id=mem0,size=8G \
    -chardev socket,id=char0,reconnect=0,path=/tmp/vhost3.socket \
    -device vhost-user-vsock-pci,chardev=char0
    inside this guest I run: socat - VSOCK-LISTEN:9000

shell3: ./qemu-system-x86_64 -machine q35,memory-backend=mem0 \
    -enable-kvm -m 8G -nic user,model=virtio \
    -drive file=/home/dorjoy/Forks/test_vm/fedora40.qcow2,media=disk,if=virtio \
    --display sdl -object memory-backend-memfd,id=mem0,size=8G \
    -chardev socket,id=char0,reconnect=0,path=/tmp/vhost4.socket \
    -device vhost-user-vsock-pci,chardev=char0
    inside this guest I run: socat - VSOCK-CONNECT:3:9000

Then when I type something in the socat terminal of one VM and hit
'enter', it shows up in the socat terminal of the other VM. From the
vhost-device-vsock documentation, I thought I would need to patch
socat to set VMADDR_FLAG_TO_HOST, but I did not do anything to socat.
I simply did "sudo dnf install socat" in both VMs. I also looked at
the socat source code and did not see any reference to
VMADDR_FLAG_TO_HOST. I am running Fedora 40 in both VMs. Do you know
why it works without the flag?

On Wed, Jun 26, 2024 at 11:43 PM Dorjoy Chowdhury
<dorjoychy111@xxxxxxxxx> wrote:
>
> Hey Stefano,
> Thanks a lot for all the details. I will look into them and reach out
> if I need further input. Thanks! I have tried to summarize my
> understanding below. Let me know if that sounds correct.
>
> On Wed, Jun 26, 2024 at 2:37 PM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
> >
> > Hi Dorjoy,
> >
> > On Tue, Jun 25, 2024 at 11:44:30PM GMT, Dorjoy Chowdhury wrote:
> > >Hey Stefano,
> >
> > [...]
> >
> > >> >
> > >> >So the immediate plan would be to:
> > >> >
> > >> >  1) Build a new vhost-vsock-forward object model that connects to
> > >> >vhost as CID 3 and then forwards every packet from CID 1 to the
> > >> >Enclave-CID and every packet that arrives on to CID 3 to CID 2.
> > >>
> > >> This though requires writing completely from scratch the virtio-vsock
> > >> emulation in QEMU. If you have time that would be great, otherwise if
> > >> you want to do a PoC, my advice is to start with vhost-user-vsock which
> > >> is already there.
> > >>
> > >
> > >Can you give me some more details about how I can implement the
> > >daemon?
> >
> > We already have a daemon written in Rust, so I don't recommend you
> > rewrite one from scratch, just start with that. You can find the daemon
> > and instructions on how to use it with QEMU here [1].
> >
> > >I would appreciate some pointers to code too.
> >
> > I sent the pointer to it in my first reply [2].
> >
> > >
> > >Right now, the "nitro-enclave" machine type (wip) in QEMU
> > >automatically spawns a VHOST_VSOCK device with the CID equal to the
> > >"guest-cid" machine option. I think this is equivalent to using the
> > >"-device vhost-vsock-device,guest-cid=N" option explicitly. Does that
> > >need any change? I guess instead of "vhost-vsock-device", the
> > >vhost-vsock device needs to be equivalent to "-device
> > >vhost-user-vsock-device,guest-cid=N"?
> >
> > Nope, the vhost-user-vsock device requires just a `chardev` option.
> > The chardev points to the Unix socket used by QEMU to talk with the
> > daemon. The daemon has a parameter to set the CID. See [1] for the
> > examples.
> >
> > >
> > >The applications inside the nitro-enclave VM will still connect and
> > >talk to CID 3. So on the daemon side, do we need to spawn a device
> > >that has CID 3 and then forward everything this device receives to CID
> > >1 (VMADDR_CID_LOCAL) same port and everything it receives from CID 1
> > >to the "guest-cid"?
> >
> > Yep, I think this is right.
> > Note: to use VMADDR_CID_LOCAL, the host needs to load `vsock_loopback`
> > kernel module.
> >
> > Before modifying the code, if you want to do some testing, perhaps you
> > can use socat (which supports both UNIX-* and VSOCK-*). The daemon for
> > now exposes two unix sockets, one is used to communicate with QEMU via
> > the vhost-user protocol, and the other is to be used by the application
> > to communicate with vsock sockets in the guest using the hybrid protocol
> > defined by firecracker. So you could initiate a socat between the latter
> > and VMADDR_CID_LOCAL, the only problem I see is that you have to send
> > the first string provided by the hybrid protocol (CONNECT 1234), but for
> > a PoC it should be ok.
> >
> > I just tried the following and it works without touching any code:
> >
> > shell1$ ./target/debug/vhost-device-vsock \
> >      --vm guest-cid=3,socket=/tmp/vhost3.socket,uds-path=/tmp/vm3.vsock
> >
> > shell2$ sudo modprobe vsock_loopback
> > shell2$ socat VSOCK-LISTEN:1234 UNIX-CONNECT:/tmp/vm3.vsock
> >
> > shell3$ qemu-system-x86_64 -smp 2 -M q35,accel=kvm,memory-backend=mem \
> >      -drive file=fedora40.qcow2,format=qcow2,if=virtio\
> >      -chardev socket,id=char0,path=/tmp/vhost3.socket \
> >      -device vhost-user-vsock-pci,chardev=char0 \
> >      -object memory-backend-memfd,id=mem,size=512M \
> >      -nographic
> >
> >      guest$ nc --vsock -l 1234
> >
> > shell4$ nc --vsock 1 1234
> > CONNECT 1234
> >
> >      Note: the `CONNECT 1234` is required by the hybrid vsock protocol
> >      defined by firecracker, so if we extend the vhost-device-vsock
> >      daemon to forward packet to VMADDR_CID_LOCAL, that would not be
> >      needed (including running socat).
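
Just to check my understanding of the hybrid protocol part: from a
host application's point of view, I think the handshake socat/nc are
doing above would look roughly like the sketch below (assuming the
daemon answers with an "OK <port>" line before raw data flows; the
path and port are only examples):

/* Sketch only: Firecracker-style hybrid handshake against the
 * daemon's unix socket, e.g. hybrid_connect("/tmp/vm3.vsock", 1234). */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int hybrid_connect(const char *uds_path, unsigned int port)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr;
    char buf[64];

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, uds_path, sizeof(addr.sun_path) - 1);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;

    /* First line of the hybrid protocol: which guest port to reach. */
    snprintf(buf, sizeof(buf), "CONNECT %u\n", port);
    write(fd, buf, strlen(buf));

    /* Expect something like "OK <assigned port>\n" back, then raw data. */
    read(fd, buf, sizeof(buf) - 1);

    return fd;
}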
> >
>
> Understood. Just trying to think out loud what the final UX will be
> from the user perspective to successfully run a nitro VM before I try
> to modify vhost-device-vsock to support forwarding to
> VMADDR_CID_LOCAL.
> I guess because the "vhost-user-vsock" device needs to be spawned
> implicitly (without any explicit option) inside nitro-enclave in QEMU,
> we now need to provide the "chardev" as a machine option, so the
> nitro-enclave command would look something like below:
> "./qemu-system-x86_64 -M nitro-enclave,chardev=char0 -kernel
> /path/to/eif -chardev socket,id=char0,path=/tmp/vhost5.socket -m 4G
> --enable-kvm -cpu host"
> and then, in the code, set the chardev id from the machine option on
> the vhost-user-vsock device.
>
> The modified "vhost-device-vsock" would need to be run with a new
> option that forwards everything to VMADDR_CID_LOCAL (below, "-z"
> stands for that new option):
> "./target/debug/vhost-device-vsock -z --vm
> guest-cid=5,socket=/tmp/vhost5.socket,uds-path=/tmp/vm5.vsock"
> This means the guest-cid of the nitro VM is CID 5, right?
>
> And the applications on the host would need to use VMADDR_CID_LOCAL
> for communication instead of the "guest-cid" (5) (assuming
> vsock_loopback is modprobed). Let's say there are two applications
> inside the nitro VM that connect to CID 3 on ports 9000 and 9001, and
> the applications on the host listen on ports 9000 and 9001 using
> VMADDR_CID_LOCAL. So, after the commands above (qemu VM and
> vhost-device-vsock) are run, the communication between the
> applications on the host and the applications in the nitro VM on
> ports 9000 and 9001 should just work, right, without needing to run
> any extra socat commands or such? Or will the user still need to run
> some socat commands for all the relevant ports (e.g., 9000 and 9001)?
>
> I am also wondering what kind of changes are needed in
> vhost-device-vsock to forward packets to VMADDR_CID_LOCAL. Would it
> be something like this: in the codepath that handles
> "/tmp/vm5.vsock", upon receiving a "connect" (from inside the nitro
> VM) for any port, vhost-device-vsock just connects to the same port
> over AF_VSOCK using the socket system calls, and messages received
> on that port via "/tmp/vm5.vsock" are then sent to that AF_VSOCK
> socket? Or am I not thinking about this right, and the implementation
> would be something different entirely (e.g., rewriting the CID from 3
> to 2 (or 1?) on the packets before they are handled, in which case
> socat would probably still be needed)? Would this also work if
> applications on the host want to connect to applications inside the
> nitro VM (as opposed to applications inside the nitro VM connecting
> to CID 3)?
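
To make that question more concrete, the host-side part of the
forwarding I have in mind would be roughly the sketch below: when the
guest connects to some port, the daemon opens an AF_VSOCK connection
to VMADDR_CID_LOCAL on the same port and then relays bytes in both
directions (the relay loop is omitted; this is only how I picture it,
not how the daemon is structured today):

/* Sketch only: the host side of the forwarding described above.
 * Requires the vsock_loopback module to be loaded on the host. */
#include <string.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

int open_local_peer(unsigned int port)
{
    int fd = socket(AF_VSOCK, SOCK_STREAM, 0);
    struct sockaddr_vm addr;

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.svm_family = AF_VSOCK;
    addr.svm_cid = VMADDR_CID_LOCAL;  /* CID 1, loopback on the host */
    addr.svm_port = port;             /* same port the guest asked for */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    return fd;
}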
>
> Thanks and Regards,
> Dorjoy




