On 15/01/2020 16.12, Radoslaw Zarzynski wrote:
On Wed, Jan 15, 2020 at 2:34 PM Avi Kivity <avi@xxxxxxxxxxxx> wrote:
This is what Seastar provides today and I agree it can and should be improved.
I agree. Let's continue our discussion and try to find the way. :-)
What I propose is:
* WHEN you have a requirement for aligned buffers, use
"application-provided"
* WHEN you do not have a requirement for aligned buffers, use
"stack-provided"
After the application starts, do you not know whether you have a
requirement for alignment or not?
We have the knowledge on alignment, so let's experiment with
the proposed ruleset to judge performance repercussions.
Today, when crimson-osd is all about the cyan store (simple,
RAM-backed store for testing), we can definitely say there is
no requirement for alignment. Based on that and the rule:
* WHEN you do not have a requirement for aligned buffers, use
"stack-provided".
Therefore we should opt for "stack-provided". Let's verify
the result:
* if the actual stack is native, everything is OK. There will
not be even a single memcpy, and no syscall.
* if the actual stack is POSIX, as there is no provided buffer,
there is also no buffer.length. The stack needs to guess
how many bytes to read() from the socket. If the guessed
number is too small, the application is hurt by excessive
syscalls. This happens today. :-(
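
To put a rough number on the cost of a bad guess, here is a minimal
sketch. min_read_calls() is a hypothetical helper, not a Seastar
function, and it assumes the best case where every read() returns a
full `guess` bytes while data is pending:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: lower bound on the number of read() calls needed
// to drain `pending` bytes when each call asks for at most `guess` bytes.
// Assumes every call returns as many bytes as it asked for.
std::size_t min_read_calls(std::size_t pending, std::size_t guess) {
    assert(guess > 0);
    return (pending + guess - 1) / guess;  // ceiling division
}
```

With a 64 KiB message pending, a 4 KiB guess costs at least 16
syscalls, while a 64 KiB guess could get away with one.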
Ok, so it's not just about alignment, but also about sizes. We can also
allow the application to specify how many bytes it wants to read (in
fact, it can already do that with read_exactly, but input_stream does
not pass the information along).
Let's list the possible cases:
- the application knows nothing (common when parsing a complex stream
containing small objects). This is where Scylla is, similar to an HTTP
server.
- the protocol has rigid structure (fixed size header + variable
payload). The application wants the header in a linearized buffer and
the payload in a free-form iovec. This corresponds to cyanstore.
- the protocol has rigid structure as above. The application wants the
header in a linearized buffer and the payload in its own buffers due to
alignment or ownership requirements. This corresponds to a production
storage server that has alignment requirements for talking to storage
and ownership/placement requirements for caching blocks.
In the first case, input_stream should provide buffers as it reads them.
Buffers can end due to packet boundaries (native stack), input_stream
buffer boundaries (posix stack) or due to exhausting all received data
(both).
In the second case, the socket (perhaps not input_stream) should
linearize the header, provide the payload as a sequence of buffers, and
should attempt not to over-read (over-reading the payload can require
linearization of the next header or trailer).
In the third case, the socket linearizes both the header and payload,
the first into a buffer it allocates by itself, the second into a buffer
provided by the user.
Is this a good set of capabilities to provide?
If it is, then we can implement "linearizes" differently for each stack,
and also depending on whether the buffer is provided by the user or the
stack.
For buffers provided by the stack (for which there is only a
linearization requirement, not a placement requirement):
- posix allocates a buffer and issues read() syscalls until the buffer
is full
- native will attempt to temporary_buffer::share() the buffer if it
fits into a packet, and allocate and copy if it does not
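
A minimal sketch of the native "share if it fits, copy otherwise"
rule. It deliberately does not use Seastar types: `storage`, `view`
and linearize() are hypothetical stand-ins, with a std::shared_ptr
playing the role that temporary_buffer::share() plays in Seastar.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a received packet's bytes.
using storage = std::shared_ptr<std::vector<char>>;

// Hypothetical reference-counted view into some storage.
struct view {
    storage owner;        // keeps the bytes alive
    std::size_t offset;   // first byte of the view inside *owner
    std::size_t length;   // number of bytes in the view
    const char* data() const { return owner->data() + offset; }
};

// Linearize `n` bytes starting `offset` bytes into packets[0].
// Shares packets[0] when the range fits inside it (no copy); otherwise
// allocates a new buffer and copies across packet boundaries.
// Precondition: at least `n` bytes are available from that position.
view linearize(const std::vector<storage>& packets,
               std::size_t offset, std::size_t n) {
    assert(!packets.empty() && offset <= packets[0]->size());
    if (offset + n <= packets[0]->size()) {
        return view{packets[0], offset, n};   // zero-copy path
    }
    auto out = std::make_shared<std::vector<char>>();
    out->reserve(n);
    for (const auto& p : packets) {
        if (out->size() == n) break;
        std::size_t take = std::min(n - out->size(), p->size() - offset);
        out->insert(out->end(), p->data() + offset,
                    p->data() + offset + take);
        offset = 0;   // later packets are consumed from their start
    }
    assert(out->size() == n);
    return view{out, 0, n};
}
```

The caller cannot tell the two paths apart except by cost, which is
the point: "linearized" is a property of the result, not of how it
was produced.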
For buffers provided by the user (placement requirement):
- posix issues repeated read() syscalls until the buffer is full
- native will memcpy from raw packets into the buffers
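
The posix side of this can be sketched as a plain loop over read().
read_full() is a hypothetical name, and the sketch assumes a blocking
file descriptor; the stack-provided variant would be the same loop
over a buffer that the stack allocates itself.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Fill buf[0..len) from fd with repeated read() calls.
// Returns the number of bytes read: len on success, fewer if EOF was
// hit first, or -1 on error (errno is left as set by read()).
// Assumes a blocking fd; a non-blocking fd would also need EAGAIN handling.
ssize_t read_full(int fd, char* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    std::size_t done = 0;
    while (done < len) {
        ssize_t r = ::read(fd, buf + done, len - done);
        if (r > 0) {
            done += static_cast<std::size_t>(r);
        } else if (r == 0) {
            break;                 // EOF before the buffer was full
        } else if (errno == EINTR) {
            continue;              // interrupted by a signal, retry
        } else {
            return -1;
        }
    }
    return static_cast<ssize_t>(done);
}
```

Because the destination is chosen by the caller, an aligned or
cache-owned buffer works here without any extra copy.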
Note: "buffer" here can also be an iovec or equivalent. In that case it
will be read into using readv(), and "linearization" only happens within
individual elements of the iovec.
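
For the iovec case, a minimal sketch of "read until every element is
full" on the posix stack. readv_full() is a hypothetical name; the
sketch assumes a blocking fd and an iovec count within the system's
IOV_MAX limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/uio.h>
#include <unistd.h>
#include <vector>

// Fill every element of `iov` from fd using readv(), retrying after
// short reads. `iov` is taken by value because partially filled
// elements are advanced in place between calls.
// Returns total bytes read (short only on EOF), or -1 on error.
ssize_t readv_full(int fd, std::vector<iovec> iov) {
    std::size_t first = 0;   // index of the first element not yet full
    ssize_t total = 0;
    while (first < iov.size()) {
        if (iov[first].iov_len == 0) { ++first; continue; }
        ssize_t r = ::readv(fd, iov.data() + first,
                            static_cast<int>(iov.size() - first));
        if (r == 0) break;                        // EOF
        if (r < 0) {
            if (errno == EINTR) continue;         // retry after a signal
            return -1;
        }
        total += r;
        std::size_t got = static_cast<std::size_t>(r);
        // Advance past the bytes just received, element by element.
        while (got > 0) {
            assert(first < iov.size());
            std::size_t used = std::min(got, iov[first].iov_len);
            iov[first].iov_base =
                static_cast<char*>(iov[first].iov_base) + used;
            iov[first].iov_len -= used;
            got -= used;
            if (iov[first].iov_len == 0) ++first;
        }
    }
    return total;
}
```

Each element ends up contiguous on its own, so a header can go into a
small element and the payload into one or more aligned elements with
a single syscall in the good case.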