On Thu, Aug 05, 2021 at 12:21:57PM +0300, Arseny Krasnov wrote:
On 05.08.2021 12:06, Stefano Garzarella wrote:
On Thu, Aug 05, 2021 at 11:33:12AM +0300, Arseny Krasnov wrote:
On 04.08.2021 15:57, Stefano Garzarella wrote:
Hi Arseny,
On Mon, Jul 26, 2021 at 07:31:33PM +0300, Arseny Krasnov wrote:
This patchset implements support for the MSG_EOR bit for SEQPACKET
AF_VSOCK sockets over the virtio transport.
The idea is to distinguish the concepts of 'messages' and 'records'.
A message is the result of a sending call: 'write()', 'send()',
'sendmsg()' etc. It has a fixed maximum length, and its bounds are
visible through the return value of receive calls: 'read()', 'recv()',
'recvmsg()' etc.
The current implementation is based on the message definition above.
Okay, so the implementation we merged is wrong, right?
Should we disable the feature bit in stable kernels that contain it? Or
maybe we can backport the fixes...
Hi,
No, this is correct, and it is based on message boundaries. The idea of
this patchset is to add an extra boundary marker, which I think could be
useful when we want to send data in seqpacket mode whose length is
bigger than the maximum message length (this is limited by the
transport). Of course we can fragment a big piece of data into small
messages, but this requires carrying fragmentation info in the data
protocol. So in this case, when we want to maintain boundaries, the
receiver calls recvmsg() until MSG_EOR is found.
But when the receiver knows that the data fits in the maximum datagram
length, it doesn't need to check MSG_EOR and just calls recv() or
read() (i.e. message-based mode).
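As a rough illustration of the receiver side described above, here is a minimal sketch of a loop that calls recvmsg() until MSG_EOR is reported. The helper name recv_record() and its parameters are hypothetical and not taken from the patchset. It assumes the transport reports MSG_EOR in msg_flags, and that the caller's buffer is large enough that no single message is truncated.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: append messages to buf until one arrives with
 * MSG_EOR set, the peer closes the connection, or buf is full.
 * Returns the number of bytes stored, or -1 on error.
 * *eor is set to 1 only if MSG_EOR was actually seen.
 */
static ssize_t recv_record(int fd, char *buf, size_t cap, int *eor)
{
    size_t total = 0;

    assert(buf != NULL && eor != NULL);
    *eor = 0;

    while (total < cap) {
        struct iovec iov = { .iov_base = buf + total, .iov_len = cap - total };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
        ssize_t n = recvmsg(fd, &msg, 0);

        if (n < 0)
            return -1;          /* errno is left for the caller */
        if (n == 0)
            break;              /* peer closed, or an empty message */

        total += (size_t)n;
        if (msg.msg_flags & MSG_EOR) {
            *eor = 1;           /* end of record reached */
            break;
        }
    }
    return (ssize_t)total;
}
```

A receiver that knows its data always fits in one message can skip this helper and call recv() directly; the loop only matters when one record spans several messages.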
I'm not sure we should maintain boundaries of multiple send(), from
POSIX standard [1]:
Yes, but also from POSIX: calls like send() and sendmsg()
operate on a "message", and if we check recvmsg() we
find the following:
For message-based sockets, such as SOCK_DGRAM and SOCK_SEQPACKET, the entire
message shall be read in a single operation. If a message is too long to fit in the supplied
buffers, and MSG_PEEK is not set in the flags argument, the excess bytes shall be discarded.
I understand this to mean that send() boundaries must also be maintained.
I've checked SEQPACKET in AF_UNIX and AX.25: neither supports
MSG_EOR, so send() boundaries must be supported.
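The "excess bytes shall be discarded" rule quoted above can be observed with a small experiment. The sketch below assumes a Linux AF_UNIX SOCK_SEQPACKET socket pair (not AF_VSOCK), and the helper names recv_one() and demo_truncation() are made up for illustration. It sends an 8-byte message followed by a 2-byte one and reads both through a 4-byte buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Receive one message; *truncated is set if MSG_TRUNC was reported. */
static ssize_t recv_one(int fd, void *buf, size_t cap, int *truncated)
{
    struct iovec iov = { .iov_base = buf, .iov_len = cap };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t n;

    assert(truncated != NULL);
    n = recvmsg(fd, &msg, 0);
    *truncated = (n >= 0) && (msg.msg_flags & MSG_TRUNC);
    return n;
}

/* Returns 0 if send() boundaries are kept and the excess is discarded. */
static int demo_truncation(void)
{
    int sv[2], t1 = 0, t2 = 0, ok;
    char buf[4];

    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
        return -1;

    ok = send(sv[0], "AAAAAAAA", 8, 0) == 8 &&
         send(sv[0], "BB", 2, 0) == 2;

    if (ok) {
        /* First read: only 4 of 8 bytes fit, the rest is dropped. */
        ssize_t n1 = recv_one(sv[1], buf, sizeof(buf), &t1);
        ok = n1 == 4 && t1 && memcmp(buf, "AAAA", 4) == 0;
    }
    if (ok) {
        /* Second read: starts at the next message, not at the leftover. */
        ssize_t n2 = recv_one(sv[1], buf, sizeof(buf), &t2);
        ok = n2 == 2 && !t2 && memcmp(buf, "BB", 2) == 0;
    }

    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

Under these assumptions the second read returns "BB" rather than the remaining "AAAA", which is the send() boundary behaviour described here.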
SOCK_SEQPACKET
Provides sequenced, reliable, bidirectional, connection-mode
transmission paths for records. A record can be sent using one or
more output operations and received using one or more input
operations, but a single operation never transfers part of more than
one record. Record boundaries are visible to the receiver via the
MSG_EOR flag.
From my understanding a record could be sent with multiple send() calls
and received, for example, with a single recvmsg().
The only boundary should be the MSG_EOR flag set by the user on the
last send() of a record.
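For the sender side of that reading, a record larger than the transport's message limit could be sent along these lines. This is only a sketch: send_record() and the max_msg parameter are hypothetical, and it assumes the socket accepts MSG_EOR in send() flags and that each send() on a SEQPACKET socket either transmits the whole chunk or fails.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: send one record as a sequence of messages of at
 * most max_msg bytes each, setting MSG_EOR only on the last message.
 * Returns the number of messages sent, or -1 on error.
 */
static int send_record(int fd, const char *data, size_t len, size_t max_msg)
{
    size_t off = 0;
    int count = 0;

    assert(data != NULL && max_msg > 0);

    while (off < len) {
        size_t chunk = (len - off < max_msg) ? len - off : max_msg;
        int flags = (off + chunk == len) ? MSG_EOR : 0;
        ssize_t n = send(fd, data + off, chunk, flags);

        if (n < 0 || (size_t)n != chunk)
            return -1;          /* a partial send is treated as an error */

        off += chunk;
        count++;
    }
    return count;
}
```

With a 10-byte record and max_msg set to 4, this issues three send() calls of 4, 4 and 2 bytes, and only the last one carries MSG_EOR.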
You are right, if we are talking about a "record".
From send() description [2]:
MSG_EOR
Terminates a record (if supported by the protocol).
From recvmsg() description [3]:
MSG_EOR
End-of-record was received (if supported by the protocol).
Thanks,
Stefano
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/send.html
[3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recvmsg.html
P.S.: it seems SEQPACKET is such an exotic thing that everyone
implements it in their own manner, because I've tested the SCTP
seqpacket implementation and found that:
1) It doesn't support the MSG_EOR bit on the send side, but uses
MSG_EOR on the receiver side to mark the MESSAGE boundary.
2) According to POSIX, any extra bytes that didn't fit in the user's
buffer must be dropped, but SCTP doesn't drop them: you can read the
rest of the datagram in subsequent calls.
Thanks for this useful information, now I see the differences and why we
should support both.
I think it is better to include them in the cover letter.
I'm going to review the patches right now :-)
Stefano
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization