Re: [PATCH] net: tls: fix tls with sk_redirect using a BPF verdict.

Vadim Fedorenko <vfedorenko@xxxxxxxx> · Thu, 30 Jun 2022 16:21:42 +0100

On 29.06.2022 08:00, John Fastabend wrote:
Jakub Kicinski wrote:
On Tue, 28 Jun 2022 17:25:05 +0200 Julien Salleyron wrote:
This patch allows to use KTLS on a socket where we apply sk_redirect using a BPF
verdict program.


You'll also need a signed-off-by.

Without this patch, we see that the data received after the redirection are
decrypted but with an incorrect offset and length. It seems to us that the
offset and length are correct in the stream-parser data, but finally not applied
in the skb. We have simply applied those values to the skb.

In the case of regular sockets, we saw a big performance improvement from
applying redirect. This is not the case now with KTLS, may be related to the
following point.

It's because kTLS does a very expensive reallocation and copy for the
non-zerocopy case (which currently means all of TLS 1.3). I have
code almost ready to fix that (just needs to be reshuffled into
upstreamable patches). Brings us up from 5.9 Gbps to 8.4 Gbps per CPU
on my test box with 16k records. Probably much more than that with
smaller records.

Also on my list open-ssl support is lacking ktls support for both
direction in tls1.3 iirc. We have a couple test workloads pinned on
1.2 for example which really isn't great.


AFAIK in-kernel TLS 1.3 is supported in OpenSSL 3.0, I implemented TX part long 
time ago and was fixing some parts while it was 3.0-alpha. Not sure about RX.
Or are you talking about zero-copy implementation?


It is still necessary to perform a read operation (never triggered) from user
space despite the redirection. It makes no sense, since this read operation is
not necessary on regular sockets without KTLS.

We do not see how to fix this problem without a change of architecture, for
example by performing TLS decrypt directly inside the BPF verdict program.

An example program can be found at
https://github.com/juliens/ktls-bpf_redirect-example/

Co-authored-by: Marc Vertes <mvertes@xxxxxxx>
---
  net/tls/tls_sw.c                           | 6 ++++++
  tools/testing/selftests/bpf/test_sockmap.c | 8 +++-----
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 0513f82b8537..a409f8a251db 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1839,8 +1839,14 @@ int tls_sw_recvmsg(struct sock *sk,
  			if (bpf_strp_enabled) {
  				/* BPF may try to queue the skb */
  				__skb_unlink(skb, &ctx->rx_list);
+
  				err = sk_psock_tls_strp_read(psock, skb);
+
  				if (err != __SK_PASS) {
+                    if (err == __SK_REDIRECT) {
+                        skb->data += rxm->offset;
+                        skb->len = rxm->full_len;
+                    }

IDK what this is trying to do but I certainly depends on the fact
we run skb_cow_data() and is not "generally correct" :S

Ah also we are not handling partially consumed correctly either.
Seems we might pop off the skb even when we need to continue;

Maybe look at how skb_copy_datagram_msg() goes below because it
fixes the skb copy up with the rxm->offset. But, also we need to
do this repair before sk_psock_tls_strp_read I think so that
the BPF program reads the correct data in all cases? I guess
your sample program (and selftests for that matter) just did
the redirect without reading the data?

Thanks!