Hi Arseny,

On Tue, Mar 23, 2021 at 04:07:13PM +0300, Arseny Krasnov wrote:
This patchset implements support for SOCK_SEQPACKET in the virtio transport.

SOCK_SEQPACKET guarantees that record boundaries are preserved. To do that, two new packet operations were added: one marks the start of a record and the other marks its end (SEQ_BEGIN and SEQ_END below). Both operations carry metadata to maintain boundaries and payload integrity. The metadata is a special header with two fields, message id and message length:

	struct virtio_vsock_seq_hdr {
		__le32  msg_id;
		__le32  msg_len;
	} __attribute__((packed));

This header is transmitted as the payload of SEQ_BEGIN and SEQ_END packets (the buffer of the second virtio descriptor in the chain), in the same way as data is transmitted in RW packets. The payload was chosen as the buffer for this header to avoid touching the first virtio buffer, which carries the packet header, because someone could check that the size of that buffer is equal to the size of the packet header.

To send a record, a packet with the start marker is sent first (its header carries the length and id of the record), then all data is sent as usual 'RW' packets, and finally SEQ_END is sent (it carries the message id, which is equal to the id in SEQ_BEGIN). After SEQ_END is sent, the id is incremented.

On the receiver's side, the size of the record is known from the packet with the start-of-record marker. To check that no packets were dropped by the transport, the 'msg_id's of consecutive SEQ_BEGIN and SEQ_END packets are checked to be equal, and the length of data between the two markers is compared to the length in the SEQ_BEGIN header.

Since packets of one socket are not reordered on either the vsock or the vhost transport layer, these markers allow the original record to be restored on the receiver's side. If the user's buffer is smaller than the record length, then all out-of-size data is dropped. The maximum length of a datagram is not limited, as with a stream socket, because the same credit logic is used. The difference from a stream socket is that the user is not woken up until the whole record is received or an error occurs.
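
The receiver-side check described above can be sketched in C as follows. This is only an illustration under stated assumptions, not code from the series: `struct seq_hdr`, `struct seq_rx_state` and the three `seq_rx_*()` helpers are hypothetical names, and the header is shown in host byte order rather than with `__le32` fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-endian mirror of the header from the cover letter (hypothetical name). */
struct seq_hdr {
	uint32_t msg_id;
	uint32_t msg_len;
};

/* Hypothetical per-socket receive state; not taken from the patchset. */
struct seq_rx_state {
	bool in_record;    /* SEQ_BEGIN seen, SEQ_END not seen yet */
	uint32_t msg_id;   /* id announced by SEQ_BEGIN */
	uint32_t msg_len;  /* record length announced by SEQ_BEGIN */
	uint32_t received; /* payload bytes counted from RW packets so far */
};

/* SEQ_BEGIN: remember the announced id and length, reset the byte counter. */
void seq_rx_begin(struct seq_rx_state *rx, const struct seq_hdr *hdr)
{
	assert(rx && hdr);
	rx->in_record = true;
	rx->msg_id = hdr->msg_id;
	rx->msg_len = hdr->msg_len;
	rx->received = 0;
}

/* RW packet: account for its payload length while a record is open. */
void seq_rx_data(struct seq_rx_state *rx, uint32_t len)
{
	assert(rx);
	if (rx->in_record)
		rx->received += len;
}

/*
 * SEQ_END: the record is accepted only if a record was open, the ids of
 * SEQ_BEGIN and SEQ_END match, and the counted bytes equal msg_len.
 * The state is closed in every case.
 */
bool seq_rx_end(struct seq_rx_state *rx, const struct seq_hdr *hdr)
{
	bool ok;

	assert(rx && hdr);
	ok = rx->in_record &&
	     hdr->msg_id == rx->msg_id &&
	     rx->received == rx->msg_len;
	rx->in_record = false;
	return ok;
}
```

A caller would invoke `seq_rx_begin()` on SEQ_BEGIN, `seq_rx_data()` for each RW packet, and treat a `false` result from `seq_rx_end()` as a dropped or corrupted record.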
Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags. Tests also implemented.

Thanks to stsp2@xxxxxxxxx for encouragements and initial design recommendations.

Arseny Krasnov (22):
  af_vsock: update functions for connectible socket
  af_vsock: separate wait data loop
  af_vsock: separate receive data loop
  af_vsock: implement SEQPACKET receive loop
  af_vsock: separate wait space loop
  af_vsock: implement send logic for SEQPACKET
  af_vsock: rest of SEQPACKET support
  af_vsock: update comments for stream sockets
  virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
  virtio/vsock: simplify credit update function API
  virtio/vsock: dequeue callback for SOCK_SEQPACKET
  virtio/vsock: fetch length for SEQPACKET record
  virtio/vsock: add SEQPACKET receive logic
  virtio/vsock: rest of SOCK_SEQPACKET support
  virtio/vsock: SEQPACKET support feature bit
  virtio/vsock: setup SEQPACKET ops for transport
  vhost/vsock: setup SEQPACKET ops for transport
  vsock/loopback: setup SEQPACKET ops for transport
  vhost/vsock: SEQPACKET feature bit support
  virtio/vsock: SEQPACKET feature bit support
  vsock_test: add SOCK_SEQPACKET tests
  virtio/vsock: update trace event for SEQPACKET

 drivers/vhost/vsock.c                        |  21 +-
 include/linux/virtio_vsock.h                 |  21 +
 include/net/af_vsock.h                       |   9 +
 .../events/vsock_virtio_transport_common.h   |  48 +-
 include/uapi/linux/virtio_vsock.h            |  19 +
 net/vmw_vsock/af_vsock.c                     | 581 +++++++++++------
 net/vmw_vsock/virtio_transport.c             |  17 +
 net/vmw_vsock/virtio_transport_common.c      | 379 +++++++++--
 net/vmw_vsock/vsock_loopback.c               |  12 +
 tools/testing/vsock/util.c                   |  32 +-
 tools/testing/vsock/util.h                   |   3 +
 tools/testing/vsock/vsock_test.c             | 126 ++++
 12 files changed, 1015 insertions(+), 253 deletions(-)

 v6 -> v7:
 General changelog:
 - virtio transport callback for message length now removed
   from transport. Length of record is returned by dequeue
   callback.

 - function which tries to get message length now returns 0
   when rx queue is empty. Also length of current message in
   progress is set to 0, when message processed or error
   happens.

 - patches for virtio feature bit moved after patches with
   transport ops.

 Per patch changelog:
  see every patch after '---' line.
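
The user-visible behaviour mentioned in the cover letter (a record that does not fit the user's buffer is truncated, the excess is dropped, and 'MSG_TRUNC' is reported) can be observed with a small user-space sketch. It uses an AF_UNIX socketpair purely as a stand-in, on the assumption that a SOCK_SEQPACKET vsock socket is meant to behave the same way at this level; a real AF_VSOCK test needs a peer CID and is not shown. The function name `seqpacket_truncation_demo()` is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send two records over a SOCK_SEQPACKET pair, read the first one into a
 * buffer that is too small, then read the second one normally.
 * AF_UNIX is an assumption made for the demo, not the vsock transport.
 * Returns 0 on success and -1 if a socket call fails.
 */
int seqpacket_truncation_demo(ssize_t *first_len, int *truncated,
			      ssize_t *second_len)
{
	char small[4], big[64];
	struct iovec iov = { .iov_base = small, .iov_len = sizeof(small) };
	struct msghdr msg;
	int sv[2];
	int ret = -1;

	assert(first_len && truncated && second_len);

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;

	/* Two separate records: 10 bytes and 3 bytes. */
	if (send(sv[0], "0123456789", 10, 0) != 10 ||
	    send(sv[0], "abc", 3, 0) != 3)
		goto out;

	/* Read the first record into a 4-byte buffer. */
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	*first_len = recvmsg(sv[1], &msg, 0);
	if (*first_len < 0)
		goto out;
	*truncated = (msg.msg_flags & MSG_TRUNC) != 0;

	/* The rest of the first record is gone; this reads the second one. */
	*second_len = recv(sv[1], big, sizeof(big), 0);
	if (*second_len < 0)
		goto out;

	ret = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

On Linux this is expected to report 4 bytes with the truncation flag set for the first read and 3 bytes for the second, which shows that the remaining 6 bytes of the first record are not delivered by a later read.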

I reviewed the series and left some comments. I think we are at a good point, but we should have the specification accepted before merging this series, to avoid having to change the implementation later.

What do you think?

Thanks,
Stefano