On Wed, May 5, 2021 at 3:49 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
>
> On Tue, May 04, 2021 at 10:06:02AM -0700, Jiang Wang . wrote:
> >On Tue, May 4, 2021 at 9:16 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
> >>
> >> Hi Jiang,
> >>
> >> On Mon, May 03, 2021 at 08:40:46PM -0700, Jiang Wang . wrote:
> >> >Hi Stefano,
> >> >
> >> >I checked the VIRTIO_NET_F_MRG_RXBUF feature bit and I think vsock
> >> >dgram can use that feature too.
> >>
> >> Cool, thanks for checking!
> >
> >NP.
> >
> >> >Do we want to make this feature a must-have or optional? One idea is
> >> >to make it optional. When not
> >>
> >> I think optional is fine, and we should support it for all kinds of
> >> traffic (stream, dgram, seqpacket).
> >
> >Got it. I was thinking only of dgram originally, but I think it should
> >be fine for stream and seqpacket too.
> >
> >Btw, I have a small implementation question. For now, vsock allocates
> >rx buffers with two scatterlists: one for the header and one for the
> >payload. After we enable the VIRTIO_NET_F_MRG_RXBUF feature, do we
> >still want to allocate buffers like that? Or could we just use one big
> >scatterlist for the whole packet? I think using the same allocation
> >method is fine, but it just may not line up with the real packets
> >well, since we will skip headers for the big packets except in the
> >first buffer.
>
> Good question.
>
> With mergeable buffers I think it's better to remove the little buffer
> for the header from the scatterlist; this should also avoid doing two
> allocations per packet/buffer in the guest.

Got it. Will do.
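Just to confirm my understanding, the driver-side change would be
roughly like the untested sketch below: one allocation and one
writable descriptor per rx buffer, instead of the current two-entry
scatterlist (virtio_vsock_rx_fill_merged is a made-up name, and
locking/stats are omitted):

/* Untested sketch: fill the rx virtqueue with single buffers that the
 * device writes both the header and the payload into, rather than
 * using a separate scatterlist entry for each.
 */
static void virtio_vsock_rx_fill_merged(struct virtio_vsock *vsock)
{
	int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
	struct virtqueue *vq = vsock->vqs[VSOCK_VQ_RX];
	struct scatterlist sg;
	void *buf;

	do {
		/* one allocation covering header + payload */
		buf = kmalloc(buf_len, GFP_KERNEL);
		if (!buf)
			break;

		sg_init_one(&sg, buf, buf_len);
		/* a single device-writable descriptor per rx buffer */
		if (virtqueue_add_inbuf(vq, &sg, 1, buf, GFP_KERNEL) < 0) {
			kfree(buf);
			break;
		}
	} while (vq->num_free);

	virtqueue_kick(vq);
}

Does that match what you had in mind?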
> >
> >> >supported, dgram rx buf is 16 KB which should be good in most cases.
> >>
> >> Why not 4 KB like for stream? Or we could make it configurable.
> >
> >OK, sure. 4 KB is fine with me. I mentioned 16 KB because I was
> >thinking of jumbo frames in the ethernet world. But I just found out a
> >jumbo frame is only about 8 KB or 9 KB.
> >
> >If we make it configurable, what kind of interface should we use to
> >configure it? In Linux, we could use something like the sysfs
> >interface. I guess we don't
>
> Yes, something like that for the guest driver.

Got it.

> >need to specify that detail in the spec, though. I will just put in
> >the spec that the size should be configurable.
>
> Yeah, I remember that at some point we fixed an issue where the host
> always expected buffers of 4 KB.
>
> Now it should support any buffer size less than or equal to 64 KB.
>

I see. I will check if there is any issue with that.

> >
> >> >When VIRTIO_NET_F_MRG_RXBUF is supported, the rx buf is 4 KB and
> >> >the max packet size is 64 KB.
> >> >
> >> >Also, just to make sure we are on the same page, the current vsock
> >> >stream code can also split a big packet into multiple buffers and
> >> >the receive side can assemble them together.
> >>
> >> Yes, sort of. Being a stream, there's no concept of a boundary.
> >>
> >> >But dgram cannot use that code because dgram may drop a buffer in
> >> >the driver code (if there is not enough space). That means dgram
> >> >may drop some buffers at the beginning, at the end, or in the
> >> >middle of a pkt, and a packet may not be received as a complete
> >> >one. Therefore, we need something like VIRTIO_NET_F_MRG_RXBUF.
> >>
> >> Yep.
> >>
> >> >If we want to leverage the current stream code without using
> >> >VIRTIO_NET_F_MRG_RXBUF, we could add a total_len and offset to the
> >> >virtio_vsock_hdr. Then when sending a packet, the device splits
> >> >the big packet into multiple small ones, each with a header. They
> >> >will have the same total_len, but different offsets. On the driver
> >> >side, the driver can check the total_len of the fragment with
> >> >offset 0 before enqueueing the big packet. If there is enough
> >> >space, all the remaining packets will be received. If not, the
> >> >remaining packets will be dropped. I feel this implementation
> >> >might be easier than using VIRTIO_NET_F_MRG_RXBUF. But either one
> >> >is fine with me. Any preference? Thanks.
> >>
> >> This is very similar to what we discussed with Michael. He pointed
> >> out that it could be complicated and we could have several problems.
> >>
> >> For example, we should also provide an ID to prevent different
> >> fragments from overlapping. Also we might have problems handling
> >> different flows at the same time.
> >>
> >> Mergeable buffers allow us to avoid these problems and also bring
> >> advantages for the other types of traffic (stream, seqpacket).
> >>
> >> It also allows us to use a single header for the packet and all its
> >> fragments.
> >>
> >> So IMHO, if there are no significant issues, the best way would be
> >> to implement mergeable buffers in vsock; I think there are only
> >> advantages to using this feature.
> >
> >Sure. Got it. I was thinking only about dgram, which is simpler than
> >stream and seqpacket. Those two will have the issues you just
> >mentioned.
> >
> >Also, just to make sure: for stream and seqpacket, supporting
> >mergeable buffers is mainly for performance improvements, right? Or
> >to save memory? I think functionally they will be the same with or
> >without mergeable buffers.
>
> Yes, right!
>
> >For dgram, the maximum supported packet size is increased when using
> >MRG_RXBUF if the rx buf size is fixed, and it can save lots of memory.
> >
> >I am a little bit confused about the motivation to support mergeable
> >buffers for stream and seqpacket. Could you remind me again? Sorry if
> >it was already mentioned in the old emails.
>
> We can save the header overhead by using a single header for the
> entire "big" packet.
>
> For example, in the current implementation, if the host has a 64 KB
> buffer to send to the guest with a stream socket, it must split it
> into 16 packets, using a header for each fragment. With mergeable
> buffers, we would save the extra header for each fragment by using a
> single initial header specifying the number of descriptors used.

OK. Sure.

> >We could only support it on dgram since dgram has its own virtqueues.
>
> Maybe that's fine for the initial implementation; then we can add
> support for the other types as well.
>
> Please keep this in mind, so it will be easier to reuse it for the
> other types.

Got it. Will do. Thanks for the suggestions and comments. I will update
the spec patch next.

> >btw, my company email system automatically adds [External] to these
> >emails, and I meant to remove it manually when I reply, but forgot to
> >do that sometimes, so the email threading may not be very accurate.
> >Sorry about that.
>
> Don't worry :-)
>
> Thanks,
> Stefano
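P.S. For the spec patch, the header I have in mind would follow
virtio-net's struct virtio_net_hdr_mrg_rxbuf: when the feature is
negotiated, the first buffer of a packet carries the usual vsock
header plus a count of the rx buffers the packet spans. A hypothetical
sketch (the struct and field names are placeholders, not part of the
current spec):

/* Hypothetical layout, by analogy with virtio-net's mergeable rx
 * buffers: only the first buffer of a packet carries this header.
 */
struct virtio_vsock_hdr_mrg_rxbuf {
	struct virtio_vsock_hdr hdr;	/* existing per-packet header */
	__le16 num_buffers;		/* rx buffers used by this packet */
};

With 4 KB rx buffers, a 64 KB dgram payload would then use a single
header with num_buffers = 16, instead of 16 fragments each carrying
its own header.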