Re: [RFC v2] virtio-vsock: add description for datagram type

On Wed, Apr 14, 2021 at 08:57:06AM +0200, Stefano Garzarella wrote:
> On Tue, Apr 13, 2021 at 03:58:34PM -0400, Michael S. Tsirkin wrote:
> > On Tue, Apr 13, 2021 at 04:03:51PM +0200, Stefano Garzarella wrote:
> > > On Tue, Apr 13, 2021 at 09:50:45AM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, Apr 13, 2021 at 03:38:52PM +0200, Stefano Garzarella wrote:
> > > > > On Tue, Apr 13, 2021 at 09:16:50AM -0400, Michael S. Tsirkin wrote:
> > > > > > On Tue, Apr 13, 2021 at 02:58:53PM +0200, Stefano Garzarella wrote:
> > > > > > > On Mon, Apr 12, 2021 at 03:42:23PM -0700, Jiang Wang . wrote:
> > > > > > > > On Mon, Apr 12, 2021 at 7:21 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Apr 12, 2021 at 02:50:17PM +0100, Stefan Hajnoczi wrote:
> > > > > > > > > >On Thu, Apr 01, 2021 at 04:36:02AM +0000, jiang.wang wrote:
> > > > > > > > > > >> Add support for datagram type for virtio-vsock. Datagram
> > > > > > > > > >> sockets are connectionless and unreliable. To avoid contention
> > > > > > > > > >> with stream and other sockets, add two more virtqueues and
> > > > > > > > > >> a new feature bit to identify if those two new queues exist or not.
> > > > > > > > > >>
> > > > > > > > > >> Also add descriptions for resource management of datagram, which
> > > > > > > > > >> does not use the existing credit update mechanism associated with
> > > > > > > > > >> stream sockets.
> > > > > > > > > >>
> > > > > > > > > >> Signed-off-by: Jiang Wang <jiang.wang@xxxxxxxxxxxxx>
> > > > > > > > > >> ---
> > > > > > > > > >> V2 addressed the comments for the previous version.
> > > > > > > > > >>
> > > > > > > > > >>  virtio-vsock.tex | 62 +++++++++++++++++++++++++++++++++++++++++++++++---------
> > > > > > > > > >>  1 file changed, 52 insertions(+), 10 deletions(-)
> > > > > > > > > >>
> > > > > > > > > >> diff --git a/virtio-vsock.tex b/virtio-vsock.tex
> > > > > > > > > >> index da7e641..62c12e0 100644
> > > > > > > > > >> --- a/virtio-vsock.tex
> > > > > > > > > >> +++ b/virtio-vsock.tex
> > > > > > > > > >> @@ -11,12 +11,25 @@ \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
> > > > > > > > > >>  \begin{description}
> > > > > > > > > >>  \item[0] rx
> > > > > > > > > >>  \item[1] tx
> > > > > > > > > >> +\item[2] datagram rx
> > > > > > > > > >> +\item[3] datagram tx
> > > > > > > > > >> +\item[4] event
> > > > > > > > > >> +\end{description}
> > > > > > > > > > >> +The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DGRAM is set. Otherwise, it
> > > > > > > > > > >> +only uses 3 queues, as follows. Rx and tx queues are always used for stream sockets.
> > > > > > > > > >> +
> > > > > > > > > >> +\begin{description}
> > > > > > > > > >> +\item[0] rx
> > > > > > > > > >> +\item[1] tx
> > > > > > > > > >>  \item[2] event
> > > > > > > > > >>  \end{description}
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > >I suggest renaming "rx" and "tx" to "stream rx" and "stream tx"
> > > > > > > > > >virtqueues and also adding this:
> > > > > > > > > >
> > > > > > > > > >  When behavior differs between stream and datagram rx/tx virtqueues
> > > > > > > > > >  their full names are used. Common behavior is simply described in
> > > > > > > > > >  terms of rx/tx virtqueues and applies to both stream and datagram
> > > > > > > > > >  virtqueues.
> > > > > > > > > >
> > > > > > > > > >This way you won't need to duplicate portions of the spec that deal with
> > > > > > > > > >populating the virtqueues, for example.
> > > > > > > > > >
> > > > > > > > > >It's also clearer to use a full name for stream rx/tx virtqueues instead
> > > > > > > > > >of calling them rx/tx virtqueues now that we have datagram rx/tx
> > > > > > > > > >virtqueues.
> > > > > > > > > >
> > > > > > > > > >> +
> > > > > > > > > >>  \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
> > > > > > > > > >>
> > > > > > > > > >> -There are currently no feature bits defined for this device.
> > > > > > > > > >> +\begin{description}
> > > > > > > > > >> +\item[VIRTIO_VSOCK_F_DGRAM (0)] Device has support for datagram socket type.
> > > > > > > > > >> +\end{description}
> > > > > > > > > >>
> > > > > > > > > >>  \subsection{Device configuration layout}\label{sec:Device Types / Socket Device / Device configuration layout}
> > > > > > > > > >>
> > > > > > > > > >> @@ -107,6 +120,9 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
> > > > > > > > > >>
> > > > > > > > > >>  \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device / Device Operation / Virtqueue Flow Control}
> > > > > > > > > >>
> > > > > > > > > >> +Flow control applies to stream sockets; datagram sockets do not have
> > > > > > > > > >> +flow control.
> > > > > > > > > >> +
> > > > > > > > > >>  The tx virtqueue carries packets initiated by applications and replies to
> > > > > > > > > >>  received packets.  The rx virtqueue carries packets initiated by the device and
> > > > > > > > > >>  replies to previously transmitted packets.
> > > > > > > > > >> @@ -140,12 +156,15 @@ \subsubsection{Addressing}\label{sec:Device Types / Socket Device / Device Opera
> > > > > > > > > >>  consists of a (cid, port number) tuple. The header fields used for this are
> > > > > > > > > >>  \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}.
> > > > > > > > > >>
> > > > > > > > > >> -Currently only stream sockets are supported. \field{type} is 1 for stream
> > > > > > > > > >> -socket types.
> > > > > > > > > >> +Currently stream and datagram (dgram) sockets are supported. \field{type} is 1 for stream
> > > > > > > > > >> +socket types. \field{type} is 3 for dgram socket types.
> > > > > > > > > >>
> > > > > > > > > >>  Stream sockets provide in-order, guaranteed, connection-oriented delivery
> > > > > > > > > >>  without message boundaries.
> > > > > > > > > >>
> > > > > > > > > >> +Datagram sockets provide connectionless unreliable messages of
> > > > > > > > > >> +a fixed maximum length.
> > > > > > > > > >
> > > > > > > > > >Plus unordered (?) and with message boundaries. In other words:
> > > > > > > > > >
> > > > > > > > > > >  Datagram sockets provide unordered, unreliable, connectionless messages
> > > > > > > > > > >  with message boundaries and a fixed maximum length.
> > > > > > > > > >
> > > > > > > > > >I didn't think of the fixed maximum length aspect before. I guess the
> > > > > > > > > >intention is that the rx buffer size is the message size limit? That's
> > > > > > > > > >different from UDP messages, which can be fragmented into multiple IP
> > > > > > > > > >packets and can be larger than 64KiB:
> > > > > > > > > >https://en.wikipedia.org/wiki/User_Datagram_Protocol#UDP_datagram_structure
> > > > > > > > > >
> > > > > > > > > >Is it possible to support large datagram messages in vsock? I'm a little
> > > > > > > > > >concerned that applications that run successfully over UDP will not be
> > > > > > > > > >portable if vsock has this limitation because it would impose extra
> > > > > > > > > >message boundaries that the application protocol might not tolerate.
> > > > > > > > >
> > > > > > > > > Maybe we can reuse the same approach Arseny is using for SEQPACKET.
> > > > > > > > > Fragment the packets according to the buffers in the virtqueue and set
> > > > > > > > > the EOR flag to indicate the last packet in the message.
> > > > > > > > >
> > > > > > > > Agree. Another option is to use the ones for skb since we may need to
> > > > > > > > use skbs for multiple transport support anyway.
> > > > > > > >
> > > > > > >
> > > > > > > The important thing I think is to have a single flag in virtio-vsock that
> > > > > > > identifies pretty much the same thing: this is the last fragment of a series
> > > > > > > to rebuild a packet.
> > > > > > >
> > > > > > > We should reuse the same flag for DGRAM and SEQPACKET.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Stefano
> > > > > >
> > > > > > Well DGRAM can drop data so I wonder whether it can work ...
> > > > > >
> > > > >
> > > > > Yep, this is true, but the channel should not be losing packets, so if the
> > > > > receiver discards packets, it knows that it must then discard all of them
> > > > > until the EOR.
> > > >
> > > > That is not so easy - they can come mixed up from multiple sources.
> > > 
> > > I think we can prevent mixing because virtqueue is point to point and its
> > > use is not thread safe, so the access (in the same peer) is already
> > > serialized.
> > > In the end the packet would be fragmented only before copying it to the
> > > virtqueue.
> > > 
> > > But maybe I missed something...
> > 
> > Well I ask what's the point of fragmenting then. I assume it's so we
> > can pass huge messages around so you can't keep locks ...
> > 
> 
> Maybe I'm wrong, but isn't this similar to what we do in virtio-net with
> mergeable buffers?

The point of mergeable buffers is to use less memory: both for each
packet and for a full receive vq.

> Also in this case I think the fragmentation will happen only in the device,
> since the driver can enqueue the entire buffer.
> 
> Maybe we can reuse mergeable buffers for virtio-vsock if the EOR flag is not
> suitable.

That sounds very reasonable.
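
For reference, the EOR-flag approach mentioned above (fragment per buffer,
flag the last fragment, discard everything up to EOR once a message has to
be dropped) could look roughly like the sketch below. This is only an
illustration: the EOR bit value, the Fragment and Reassembler names, and the
max_message limit are all hypothetical, and it assumes fragments from one
peer arrive in order and are not interleaved.

```python
from dataclasses import dataclass
from typing import List, Optional

EOR = 1  # hypothetical flag bit: "this is the last fragment of a message"


@dataclass
class Fragment:
    flags: int
    payload: bytes


def fragment(message: bytes, buf_size: int) -> List[Fragment]:
    """Split one message into buffer-sized fragments; only the last has EOR."""
    chunks = [message[i:i + buf_size]
              for i in range(0, len(message), buf_size)] or [b""]
    last = len(chunks) - 1
    return [Fragment(EOR if i == last else 0, c) for i, c in enumerate(chunks)]


class Reassembler:
    """Rebuilds messages from an in-order fragment stream of a single peer."""

    def __init__(self, max_message: int):
        self.max_message = max_message  # assumed receiver-side size limit
        self.parts: List[bytes] = []
        self.size = 0
        self.dropping = False

    def push(self, frag: Fragment) -> Optional[bytes]:
        """Return the complete message when EOR arrives, otherwise None."""
        if not self.dropping:
            self.size += len(frag.payload)
            if self.size > self.max_message:
                # No space: discard this message up to and including EOR.
                self.dropping = True
                self.parts.clear()
            else:
                self.parts.append(frag.payload)
        if not (frag.flags & EOR):
            return None
        message = None if self.dropping else b"".join(self.parts)
        self.parts, self.size, self.dropping = [], 0, False
        return message
```

The receiver resynchronizes on EOR, so one oversized or dropped message does
not corrupt the next one, provided fragments of different messages are never
interleaved on the queue.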

> IIUC in the vsock device the fragmentation for DGRAM will happen just before
> queuing it in the virtqueue, and the device can check how many buffers are
> available in the queue and it can decide whether to queue them all up or
> throw them away.
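
A rough sketch of that all-or-nothing decision, for illustration only. The
function name and the representation of the queue as a list of free buffer
sizes are hypothetical, and headers are ignored; only the payload is split.

```python
from typing import List, Optional


def enqueue_datagram(message: bytes,
                     free_buffers: List[int]) -> Optional[List[bytes]]:
    """Split `message` across the available rx buffers, in queue order.

    `free_buffers` holds the payload capacity of each available buffer.
    Returns one fragment per buffer used, or None if the datagram does not
    fit and is dropped as a whole (no buffer is consumed in that case).
    """
    if not message:
        # A zero-length datagram still needs one buffer to be delivered.
        return [b""] if free_buffers else None

    fragments: List[bytes] = []
    offset = 0
    for size in free_buffers:
        if offset >= len(message):
            break
        if size <= 0:
            continue  # skip unusable buffers
        fragments.append(message[offset:offset + size])
        offset += size

    if offset < len(message):
        return None  # not enough room: throw the whole datagram away
    return fragments
```

Because the check happens before anything is queued, the receiver never sees
a partial message in this model; a drop removes the whole datagram.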
> > 
> > > > Sure linux net core does this but with fragmentation added in,
> > > > I start wondering whether you are beginning to reinvent the net stack
> > > > ...
> > > 
> > > No, I hope not :-), in the end our advantage is that we have a channel that
> > > doesn't lose packets, so I guess we can make assumptions that the network
> > > stack can't.
> > > 
> > > Thanks,
> > > Stefano
> > 
> > I still don't know how credit accounting will work for datagram,
> > but proposals I saw seem to actually lose packets ...
> > 
> 
> I still don't know either, but I think it's not an issue on the RX side,
> since if it doesn't have space, it can drop all the fragments.
> 
> Another option to avoid fragmentation could be to allocate 64K buffers for
> the new DGRAM virtqueue.

That's a lot of buffers ...
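
To put rough numbers on it: the memory needed to keep a datagram rx queue
fully populated is queue size times buffer size. The queue sizes and the
4 KiB comparison below are assumptions picked for illustration, not values
from the proposal.

```python
KIB = 1024
MIB = 1024 * KIB


def rx_ring_bytes(queue_size: int, buf_size: int) -> int:
    """Memory needed to fill every rx descriptor with one buffer."""
    return queue_size * buf_size


if __name__ == "__main__":
    for queue_size in (128, 256, 1024):  # hypothetical queue sizes
        big = rx_ring_bytes(queue_size, 64 * KIB)
        small = rx_ring_bytes(queue_size, 4 * KIB)
        print(f"{queue_size} entries: {big // MIB} MiB with 64 KiB buffers, "
              f"{small // KIB} KiB with 4 KiB buffers")
```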

> In this way we will have at most 64K packets, which is similar to UDP/IP,
> without extra work for the fragmentation.

IIRC default MTU is 1280 not 64K ...

> 
> Thanks,
> Stefano
