Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > > However, this might well cause a malfunction in UDP, for example. > > MSG_MORE corks the current packet, so if I ask sendfile() say shove 32K > > into a packet, if, say, 16K is read from the source and entirely > > transcribed into the packet, > > If you use splice() for UDP, I don't think you would normally expect > to get all that well-defined packet boundaries. Actually, it will. Attempting to overfill a UDP packet with splice will get you -EMSGSIZE. It won't turn a splice into more than one UDP packet. I wonder if the right solution actually is to declare that the problem is userspace's. If you ask it to splice Z amount of data and it can't manage that because the source dries up prematurely, then make it so that you assume it always passed MSG_MORE and returns a short splice to userspace. Userspace can retry the splice/sendfile or do an empty sendmsg() to cap the message (or cancel it). Perhaps flushing a short message is actually a *bad* idea. The answer then might be to make TLS handle a zero-length send() and fix the test cases. Would the attached changes then work for you? David --- diff --git a/fs/splice.c b/fs/splice.c index 3e06611d19ae..237688b0700b 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -956,13 +956,17 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd, */ bytes = 0; len = sd->total_len; + + /* Don't block on output, we have to drain the direct pipe. */ flags = sd->flags; + sd->flags &= ~SPLICE_F_NONBLOCK; /* - * Don't block on output, we have to drain the direct pipe. + * We signal MORE until we've read sufficient data to fulfill the + * request and we keep signalling it if the caller set it. */ - sd->flags &= ~SPLICE_F_NONBLOCK; more = sd->flags & SPLICE_F_MORE; + sd->flags |= SPLICE_F_MORE; WARN_ON_ONCE(!pipe_empty(pipe->head, pipe->tail)); @@ -978,14 +982,12 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd, sd->total_len = read_len; /* - * If more data is pending, set SPLICE_F_MORE - * If this is the last data and SPLICE_F_MORE was not set - * initially, clears it. + * If we now have sufficient data to fulfill the request then + * we clear SPLICE_F_MORE if it was not set initially. */ - if (read_len < len) - sd->flags |= SPLICE_F_MORE; - else if (!more) + if (read_len >= len && !more) sd->flags &= ~SPLICE_F_MORE; + /* * NOTE: nonblocking mode only applies to the input. We * must not do the output in nonblocking mode as then we diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index f63e4405cf34..5d48391da16c 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -995,6 +995,9 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg, } } + if (!msg_data_left(msg) && eor) + goto copied; + while (msg_data_left(msg)) { if (sk->sk_err) { ret = -sk->sk_err; diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c index e699548d4247..7df31583f2a4 100644 --- a/tools/testing/selftests/net/tls.c +++ b/tools/testing/selftests/net/tls.c @@ -377,7 +377,7 @@ static void chunked_sendfile(struct __test_metadata *_metadata, char buf[TLS_PAYLOAD_MAX_LEN]; uint16_t test_payload_size; int size = 0; - int ret; + int ret = 0; char filename[] = "/tmp/mytemp.XXXXXX"; int fd = mkstemp(filename); off_t offset = 0; @@ -398,6 +398,9 @@ static void chunked_sendfile(struct __test_metadata *_metadata, size -= ret; } + if (ret < chunk_size) + EXPECT_EQ(send(self->fd, "", 0, 0), 0); + EXPECT_EQ(recv(self->cfd, buf, test_payload_size, MSG_WAITALL), test_payload_size);