On 24.02.2021 11:35, Stefano Garzarella wrote:
> On Wed, Feb 24, 2021 at 11:28:50AM +0300, Arseny Krasnov wrote:
>> On 24.02.2021 11:23, Stefano Garzarella wrote:
>>> On Wed, Feb 24, 2021 at 07:29:25AM +0300, Arseny Krasnov wrote:
>>>> On 23.02.2021 17:50, Stefano Garzarella wrote:
>>>>> On Mon, Feb 22, 2021 at 03:23:11PM +0100, Stefano Garzarella wrote:
>>>>>> Hi Arseny,
>>>>>>
>>>>>> On Thu, Feb 18, 2021 at 08:33:44AM +0300, Arseny Krasnov wrote:
>>>>>>> This patchset implements support of SOCK_SEQPACKET for the virtio
>>>>>>> transport.
>>>>>>> SOCK_SEQPACKET guarantees that record boundaries are preserved. To
>>>>>>> do that, two new packet operations were added: one to mark the start
>>>>>>> of a record and one to mark its end (SEQ_BEGIN and SEQ_END below).
>>>>>>> Both operations carry metadata to maintain boundaries and payload
>>>>>>> integrity. The metadata is a special header with two fields,
>>>>>>> message count and message length:
>>>>>>>
>>>>>>> struct virtio_vsock_seq_hdr {
>>>>>>>     __le32 msg_cnt;
>>>>>>>     __le32 msg_len;
>>>>>>> } __attribute__((packed));
>>>>>>>
>>>>>>> This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
>>>>>>> packets (the buffer of the second virtio descriptor in the chain), in
>>>>>>> the same way as data is transmitted in RW packets. The payload was
>>>>>>> chosen as the buffer for this header to avoid touching the first
>>>>>>> virtio buffer, which carries the packet header, because someone could
>>>>>>> check that the size of that buffer is equal to the size of the packet
>>>>>>> header. To send a record, the packet with the start marker is sent
>>>>>>> first (its header contains the length of the record and the counter),
>>>>>>> then the counter is incremented and all data is sent as usual 'RW'
>>>>>>> packets, and finally SEQ_END is sent (it also carries the message
>>>>>>> counter, which is the counter of SEQ_BEGIN + 1). After sending SEQ_END
>>>>>>> the counter is incremented again. On the receiver's side, the length
>>>>>>> of the record is known from the packet with the start-of-record marker.
>>>>>>> To check that no packets were dropped by the transport, the counters
>>>>>>> of two sequential SEQ_BEGIN and SEQ_END packets are checked (the
>>>>>>> counter of SEQ_END must be bigger than the counter of SEQ_BEGIN by 1)
>>>>>>> and the length of the data between the two markers is compared to the
>>>>>>> length in the SEQ_BEGIN header.
>>>>>>> Since packets of one socket are reordered neither on the vsock nor
>>>>>>> on the vhost transport layer, such markers allow the receiver to
>>>>>>> restore the original record. If the user's buffer is smaller than the
>>>>>>> record length, then all data beyond the buffer size is dropped.
>>>>>>> As with stream sockets, the maximum length of a datagram is not
>>>>>>> limited, because the same credit logic is used. The difference from a
>>>>>>> stream socket is that the user is not woken up until the whole record
>>>>>>> is received or an error occurs. The implementation also supports the
>>>>>>> 'MSG_EOR' and 'MSG_TRUNC' flags.
>>>>>>> Tests are also implemented.
>>>>>> I reviewed the first part (af_vsock.c changes), tomorrow I'll review
>>>>>> the rest. That part looks great to me, only found a few minor issues.
>>>>> I reviewed the rest of it as well, left a few minor comments, but I
>>>>> think we're well on track.
>>>>>
>>>>> I'll take a better look at the specification patch tomorrow.
>>>> Great, Thank You
>>>>> Thanks,
>>>>> Stefano
>>>>>
>>>>>> In the meantime, however, I'm having a doubt, especially with regard
>>>>>> to other transports besides virtio.
>>>>>>
>>>>>> Should we hide the begin/end marker sending in the transport?
>>>>>>
>>>>>> I mean, should the transport just provide a seqpacket_enqueue()
>>>>>> callback?
>>>>>> Inside it the transport would then send the markers. This is because
>>>>>> some transports might not need to send markers.
>>>>>>
>>>>>> But thinking about it more, they could actually implement stubs for
>>>>>> those calls, if they don't need to send markers.
>>>>>>
>>>>>> So I think for now it's fine since it allows us to reuse a lot of
>>>>>> code, unless someone has some objection.
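The receive-side check described in the cover letter (SEQ_END counter equals SEQ_BEGIN counter plus one, and the payload length between the markers equals the length announced in SEQ_BEGIN) can be sketched in userspace C as below. This is only an illustration, not code from the patchset: 'struct seq_hdr' uses plain uint32_t instead of __le32, and the helper names seq_record_ok() and seq_copy_len() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct virtio_vsock_seq_hdr. Assumption: the
 * fields were already converted from little-endian to host order. */
struct seq_hdr {
	uint32_t msg_cnt;
	uint32_t msg_len;
};

/* Hypothetical helper: returns true if the record delimited by 'begin'
 * and 'end' arrived intact, given the number of RW payload bytes that
 * were received between the two markers. */
bool seq_record_ok(const struct seq_hdr *begin, const struct seq_hdr *end,
		   uint32_t bytes_between)
{
	assert(begin != NULL && end != NULL);

	/* SEQ_END must carry the counter of SEQ_BEGIN plus one. */
	if (end->msg_cnt != (uint32_t)(begin->msg_cnt + 1))
		return false;

	/* Payload between the markers must match the announced length. */
	return bytes_between == begin->msg_len;
}

/* Hypothetical helper: number of bytes copied to the user when the
 * buffer may be smaller than the record; the remainder is dropped. */
uint32_t seq_copy_len(uint32_t record_len, uint32_t user_buf_len)
{
	return record_len < user_buf_len ? record_len : user_buf_len;
}
```

A record that fails seq_record_ok() would be reported to the user as an error rather than delivered; how the real patchset reports it is not shown in this thread.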
>>>> I thought about that, I'll try to implement it in the next version. Let's see...
>>> If you want to discuss it first, write down the idea you want to
>>> implement, I wouldn't want to make you do unnecessary work. :-)
>> The idea is simple: in the iov iterator of the 'struct msghdr' which is
>> passed to the enqueue callback we have two fields: 'iov_offset', which is
>> the byte offset inside the io vector where the next data must be picked,
>> and 'count', which is the number of unprocessed bytes left in the io
>> vector. So in the seqpacket enqueue callback, if 'iov_offset' is 0 I'll
>> send SEQ_BEGIN, and if 'count' is 0 I'll send SEQ_END.
>
> Got it, makes sense and it's definitely more transparent for the vsock
> core!
> Go ahead, maybe adding a comment in the vsock core explaining this, so
> other developers can understand better if they want to support SEQPACKET
> in other transports.

Ack

>
> Thanks,
> Stefano
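The marker decision discussed above can be sketched in plain C. This is a rough userspace model under stated assumptions, not the eventual patch: 'struct iter_state' is a hypothetical stand-in that mirrors only the 'iov_offset' and 'count' fields, it models a single-segment io vector (so 'iov_offset' is also the total number of bytes consumed), and seqpacket_enqueue_chunk() is an invented name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two iterator fields named in the thread. */
struct iter_state {
	size_t iov_offset;	/* offset of the next byte to pick up */
	size_t count;		/* bytes not yet processed */
};

enum {
	MARK_NONE  = 0,
	MARK_BEGIN = 1,		/* transport should send SEQ_BEGIN first */
	MARK_END   = 2,		/* transport should send SEQ_END afterwards */
};

/* Consume up to 'chunk' bytes from the iterator and report which markers
 * the transport would have to emit around this chunk. */
int seqpacket_enqueue_chunk(struct iter_state *it, size_t chunk)
{
	int marks = MARK_NONE;

	assert(it != NULL);

	/* Nothing consumed yet: this chunk starts the record. */
	if (it->iov_offset == 0)
		marks |= MARK_BEGIN;

	if (chunk > it->count)
		chunk = it->count;
	it->iov_offset += chunk;
	it->count -= chunk;

	/* Nothing left after this chunk: it ends the record. */
	if (it->count == 0)
		marks |= MARK_END;

	return marks;
}
```

With a 10-byte message sent in 4-byte chunks, the first call reports MARK_BEGIN, the second MARK_NONE and the third MARK_END; a message that fits in one chunk reports both markers at once. The vsock core never sees the markers, which is the transparency Stefano refers to.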