On 23.02.2021 17:50, Stefano Garzarella wrote:
> On Mon, Feb 22, 2021 at 03:23:11PM +0100, Stefano Garzarella wrote:
>> Hi Arseny,
>>
>> On Thu, Feb 18, 2021 at 08:33:44AM +0300, Arseny Krasnov wrote:
>>> This patchset implements support of SOCK_SEQPACKET for the virtio
>>> transport.
>>> As SOCK_SEQPACKET guarantees that record boundaries are preserved,
>>> two new packet operations were added: one to mark the start of a
>>> record and one to mark its end (SEQ_BEGIN and SEQ_END below). Both
>>> operations carry metadata to maintain boundaries and payload
>>> integrity. The metadata is carried in a special header with two
>>> fields - message count and message length:
>>>
>>> struct virtio_vsock_seq_hdr {
>>> 	__le32 msg_cnt;
>>> 	__le32 msg_len;
>>> } __attribute__((packed));
>>>
>>> This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
>>> packets (the buffer of the second virtio descriptor in the chain),
>>> in the same way as data is transmitted in RW packets. The payload
>>> was chosen as the buffer for this header to avoid touching the first
>>> virtio buffer, which carries the packet header, because someone
>>> could check that the size of that buffer is equal to the size of the
>>> packet header. To send a record, a packet with the start marker is
>>> sent first (its header contains the length of the record and the
>>> counter), then the counter is incremented and all data is sent as
>>> usual 'RW' packets, and finally SEQ_END is sent (it also carries the
>>> message counter, which is the counter of SEQ_BEGIN + 1). After
>>> sending SEQ_END the counter is incremented again. On the receiver's
>>> side, the length of the record is known from the packet with the
>>> start-of-record marker. To check that no packets were dropped by the
>>> transport, the counters of two sequential SEQ_BEGIN and SEQ_END
>>> packets are checked (the counter of SEQ_END must be bigger than the
>>> counter of SEQ_BEGIN by 1) and the length of data between the two
>>> markers is compared to the length in the SEQ_BEGIN header.
>>> Now as packets of one socket are not reordered on either the
>>> vsock or the vhost transport layer, such markers allow the original
>>> record to be restored on the receiver's side. If the user's buffer
>>> is smaller than the record length, then all data that does not fit
>>> is dropped.
>>> As with a stream socket, the maximum length of a datagram is not
>>> limited, because the same credit logic is used. The difference from
>>> a stream socket is that the user is not woken up until the whole
>>> record is received or an error occurs. The implementation also
>>> supports the 'MSG_EOR' and 'MSG_TRUNC' flags.
>>> Tests are also implemented.
>> I reviewed the first part (af_vsock.c changes), tomorrow I'll review
>> the rest. That part looks great to me, only found a few minor issues.
> I reviewed the rest of it as well, left a few minor comments, but I
> think we're well on track.
>
> I'll take a better look at the specification patch tomorrow.

Great, thank you.

>
> Thanks,
> Stefano
>
>> In the meantime, however, I'm getting a doubt, especially with regard
>> to other transports besides virtio.
>>
>> Should we hide the begin/end marker sending in the transport?
>>
>> I mean, should the transport just provide a seqpacket_enqueue()
>> callback?
>> Inside it then the transport will send the markers. This is because
>> some transports might not need to send markers.
>>
>> But thinking about it more, they could actually implement stubs for
>> those calls, if they don't need to send markers.
>>
>> So I think for now it's fine since it allows us to reuse a lot of
>> code, unless someone has some objection.

I thought about that, I'll try to implement it in the next version. Let's see...

>>
>> Thanks,
>> Stefano
>>
>
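
For reference, the receiver-side check described in the cover letter
(the SEQ_END counter must equal the SEQ_BEGIN counter + 1, and the number
of bytes seen between the two markers must equal msg_len from SEQ_BEGIN)
could be sketched in plain userspace C roughly as below. This is only an
illustration, not code from the patchset: struct seq_hdr,
struct seq_rx_state and the seq_rx_*() helpers are hypothetical names,
and the header fields are kept host-endian for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host-endian stand-in for struct virtio_vsock_seq_hdr (hypothetical). */
struct seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* Hypothetical per-socket state for the record currently being received. */
struct seq_rx_state {
	bool in_record;		/* SEQ_BEGIN seen, SEQ_END not yet seen */
	uint32_t begin_cnt;	/* msg_cnt taken from SEQ_BEGIN */
	uint32_t expected_len;	/* msg_len taken from SEQ_BEGIN */
	uint32_t received_len;	/* bytes of RW payload seen so far */
};

/* SEQ_BEGIN: remember the counter and the announced record length. */
static void seq_rx_begin(struct seq_rx_state *rx, const struct seq_hdr *hdr)
{
	assert(rx != NULL && hdr != NULL);
	rx->in_record = true;
	rx->begin_cnt = hdr->msg_cnt;
	rx->expected_len = hdr->msg_len;
	rx->received_len = 0;
}

/* RW packet: account for payload bytes that belong to the open record. */
static void seq_rx_data(struct seq_rx_state *rx, uint32_t len)
{
	assert(rx != NULL);
	if (rx->in_record)
		rx->received_len += len;
}

/*
 * SEQ_END: the record is considered intact only if the counter is
 * exactly one bigger than in SEQ_BEGIN and the amount of data between
 * the markers matches the length announced in SEQ_BEGIN. msg_len of
 * the SEQ_END header is not looked at in this sketch.
 */
static bool seq_rx_end(struct seq_rx_state *rx, const struct seq_hdr *hdr)
{
	bool ok;

	assert(rx != NULL && hdr != NULL);
	ok = rx->in_record &&
	     hdr->msg_cnt == (uint32_t)(rx->begin_cnt + 1u) &&
	     rx->received_len == rx->expected_len;
	rx->in_record = false;
	return ok;
}
```

A caller would invoke seq_rx_begin() on SEQ_BEGIN, seq_rx_data() for each
RW packet, and treat a false return from seq_rx_end() as "record damaged,
report an error to the user".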