On Wed, May 5, 2021 at 3:49 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
>
> On Tue, May 04, 2021 at 10:06:02AM -0700, Jiang Wang . wrote:
> >On Tue, May 4, 2021 at 9:16 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
> >>
> >> Hi Jiang,
> >>
> >> On Mon, May 03, 2021 at 08:40:46PM -0700, Jiang Wang . wrote:
> >> >Hi Stefano,
> >> >
> >> >I checked the VIRTIO_NET_F_MRG_RXBUF feature bit and I think vsock
> >> >dgram can use that feature too.
> >>
> >> Cool, thanks for checking!
> >
> >NP.
> >
> >> >Do we want to make this feature a must-have or optional? One idea is
> >> >to make it optional. When not
> >>
> >> I think optional is fine, and we should support it for all kinds of
> >> traffic (stream, dgram, seqpacket).
> >
> >Got it. I was thinking only of dgram originally, but I think it should
> >be fine for stream and seqpacket too.
> >
> >Btw, I have a small implementation question. For now, vsock allocates
> >rx buffers with two scatterlists: one for the header and one for the
> >payload. After we enable the VIRTIO_NET_F_MRG_RXBUF feature, do we
> >still want to allocate buffers like that? Or could we just use one big
> >scatterlist for the whole packet? I think using the same allocation
> >method is fine, but it just may not line up with the real packets
> >well, since we will skip headers for the big packets except in the
> >first buffer.
>
> Good question.
>
> With mergeable buffers I think it's better to remove the little buffer
> for the header from the scatterlist; this should also avoid doing two
> allocations per packet/buffer in the guest.

Got it. Will do.
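Just to confirm my understanding, the driver-side change would be
roughly like the untested sketch below: one allocation and one
writable descriptor per rx buffer, instead of the current two-entry
scatterlist (virtio_vsock_rx_fill_merged is a made-up name, and
locking/stats are omitted):

/* Untested sketch: fill the rx virtqueue with single buffers that the
 * device writes both the header and the payload into, rather than
 * using a separate scatterlist entry for each.
 */
static void virtio_vsock_rx_fill_merged(struct virtio_vsock *vsock)
{
	int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
	struct virtqueue *vq = vsock->vqs[VSOCK_VQ_RX];
	struct scatterlist sg;
	void *buf;

	do {
		/* one allocation covering header + payload */
		buf = kmalloc(buf_len, GFP_KERNEL);
		if (!buf)
			break;

		sg_init_one(&sg, buf, buf_len);
		/* a single device-writable descriptor per rx buffer */
		if (virtqueue_add_inbuf(vq, &sg, 1, buf, GFP_KERNEL) < 0) {
			kfree(buf);
			break;
		}
	} while (vq->num_free);

	virtqueue_kick(vq);
}

Does that match what you had in mind?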
> >
> >> >supported, dgram rx buf is 16 KB which should be good in most cases.
> >>
> >> Why not 4 KB like for stream? Or we could make it configurable.
> >
> >OK, sure. 4 KB is fine with me. I mentioned 16 KB because I was
> >thinking of jumbo frames in the ethernet world. But I just found out a
> >jumbo frame is only about 8 KB or 9 KB.
> >
> >If we make it configurable, what kind of interface should we use to
> >configure it? In Linux, we could use something like the sysfs
> >interface. I guess we don't
>
> Yes, something like that for the guest driver.

Got it.

> >need to specify that detail in the spec, though. I will just put in
> >the spec that the size should be configurable.
>
> Yeah, I remember that at some point we fixed an issue where the host
> always expected buffers of 4 KB.
>
> Now it should support any buffer size less than or equal to 64 KB.
>

I see. I will check if there is any issue with that.

> >
> >> >When VIRTIO_NET_F_MRG_RXBUF is supported, the rx buf is 4 KB and
> >> >the max packet size is 64 KB.
> >> >
> >> >Also, just to make sure we are on the same page, the current vsock
> >> >stream code can also split a big packet into multiple buffers and
> >> >the receive side can assemble them together.
> >>
> >> Yes, sort of. Being a stream, there's no concept of a boundary.
> >>
> >> >But dgram cannot use that code because dgram may drop a buffer in
> >> >the driver code (if there is not enough space). That means dgram
> >> >may drop some buffers at the beginning, at the end, or in the
> >> >middle of a pkt, and a packet may not be received as a complete
> >> >one. Therefore, we need something like VIRTIO_NET_F_MRG_RXBUF.
> >>
> >> Yep.
> >>
> >> >If we want to leverage the current stream code without using
> >> >VIRTIO_NET_F_MRG_RXBUF, we could add a total_len and offset to the
> >> >virtio_vsock_hdr. Then when sending a packet, the device splits
> >> >the big packet into multiple small ones, each with a header. They
> >> >will have the same total_len, but different offsets. On the driver
> >> >side, the driver can check the total_len of the fragment with
> >> >offset 0 before enqueueing the big packet. If there is enough
> >> >space, all the remaining packets will be received. If not, the
> >> >remaining packets will be dropped. I feel this implementation
> >> >might be easier than using VIRTIO_NET_F_MRG_RXBUF. But either one
> >> >is fine with me. Any preference? Thanks.
> >>
> >> This is very similar to what we discussed with Michael. He pointed
> >> out that it could be complicated and we could have several problems.
> >>
> >> For example, we should also provide an ID to prevent different
> >> fragments from overlapping. Also we might have problems handling
> >> different flows at the same time.
> >>
> >> Mergeable buffers allow us to avoid these problems and also bring
> >> advantages for the other types of traffic (stream, seqpacket).
> >>
> >> It also allows us to use a single header for the packet and all its
> >> fragments.
> >>
> >> So IMHO, if there are no significant issues, the best way would be
> >> to implement mergeable buffers in vsock; I think there are only
> >> advantages to using this feature.
> >
> >Sure. Got it. I was thinking only about dgram, which is simpler than
> >stream and seqpacket. Those two will have the issues you just
> >mentioned.
> >
> >Also, just to make sure: for stream and seqpacket, supporting
> >mergeable buffers is mainly for performance improvements, right? Or
> >to save memory? I think functionally they will be the same with or
> >without mergeable buffers.
>
> Yes, right!
>
> >For dgram, the maximum supported packet size is increased when using
> >MRG_RXBUF if the rx buf size is fixed, and it can save lots of memory.
> >
> >I am a little bit confused about the motivation to support mergeable
> >buffers for stream and seqpacket. Could you remind me again? Sorry if
> >it was already mentioned in the old emails.
>
> We can save the header overhead by using a single header for the
> entire "big" packet.
>
> For example, in the current implementation, if the host has a 64 KB
> buffer to send to the guest with a stream socket, it must split it
> into 16 packets, using a header for each fragment. With mergeable
> buffers, we would save the extra header for each fragment by using a
> single initial header specifying the number of descriptors used.

OK. Sure.

> >We could only support it on dgram since dgram has its own virtqueues.
>
> Maybe that's fine for the initial implementation; then we can add
> support for the other types as well.
>
> Please keep this in mind, so it will be easier to reuse it for the
> other types.

Got it. Will do. Thanks for the suggestions and comments. I will update
the spec patch next.

> >btw, my company email system automatically adds [External] to these
> >emails, and I meant to remove it manually when I reply, but forgot to
> >do that sometimes, so the email threading may not be very accurate.
> >Sorry about that.
>
> Don't worry :-)
>
> Thanks,
> Stefano
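P.S. For the spec patch, the header I have in mind would follow
virtio-net's struct virtio_net_hdr_mrg_rxbuf: when the feature is
negotiated, the first buffer of a packet carries the usual vsock
header plus a count of the rx buffers the packet spans. A hypothetical
sketch (the struct and field names are placeholders, not part of the
current spec):

/* Hypothetical layout, by analogy with virtio-net's mergeable rx
 * buffers: only the first buffer of a packet carries this header.
 */
struct virtio_vsock_hdr_mrg_rxbuf {
	struct virtio_vsock_hdr hdr;	/* existing per-packet header */
	__le16 num_buffers;		/* rx buffers used by this packet */
};

With 4 KB rx buffers, a 64 KB dgram payload would then use a single
header with num_buffers = 16, instead of 16 fragments each carrying
its own header.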