[RFC 1/1] net/tls(TLS_SW): Handle -ENOSPC error return from device/AES-NI

Pooja Trivedi <poojatrivedi@xxxxxxxxx> · Wed, 7 Oct 2020 15:19:47 -0400

When an -ENOSPC error code is returned by the crypto device or AES-NI
layer, TLS SW path sets an EBADMSG on the socket causing the
application to fail.  In an attempt to address the -ENOSPC in the TLS
SW path, changes were made in tls_sw_sendpage path to trim current
payload off the plain and encrypted messages, and send a 'zero bytes
copied' return to the application.  The following diff shows those
changes:

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9a3d9fedd7aa..4dce4668cb07 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -762,7 +762,7 @@ static int tls_push_record(struct sock *sk, int flags,
  rc = tls_do_encryption(sk, tls_ctx, ctx, req,
        msg_pl->sg.size + prot->tail_size, i);
  if (rc < 0) {
-               if (rc != -EINPROGRESS) {
+              if ((rc != -EINPROGRESS) && (rc != -ENOSPC)) {
                            tls_err_abort(sk, EBADMSG);
                            if (split) {

tls_ctx->pending_open_record_frags = true;

@@ -1073,8 +1073,15 @@ int tls_sw_sendmsg(struct sock *sk, struct
msghdr *msg, size_t size)
  else if (ret == -ENOMEM)
               goto wait_for_memory;
  else if (ret != -EAGAIN) {
-              if (ret == -ENOSPC)
+             if (ret == -ENOSPC) {
                          ret = 0;
                          copied -= try_to_copy;
                          iov_iter_revert(&msg->msg_iter,
msg_pl->sg.size - orig_size);
                          tls_trim_both_msgs(sk, orig_size);
               }
               goto send_end;
     }
  }

@@ -1150,6 +1157,7 @@ static int tls_sw_do_sendpage(struct sock *sk,
struct page *page,
  ssize_t copied = 0;
  bool full_record;
  int record_room;
+ int orig_size;
  int ret = 0;
  bool eor;

@@ -1175,7 +1183,7 @@ static int tls_sw_do_sendpage(struct sock *sk,
struct page *page,
  }

  msg_pl = &rec->msg_plaintext;
-
+ orig_size = msg_pl->sg.size;
  full_record = false;
  record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
  copy = size;

@@ -1219,8 +1227,12 @@ static int tls_sw_do_sendpage(struct sock *sk,
struct page *page,

  else if (ret == -ENOMEM)
             goto wait_for_memory;
  else if (ret != -EAGAIN) {
-            if (ret == -ENOSPC)
+           if (ret == -ENOSPC) {
+                      tls_trim_both_msgs(sk, orig_size);
                        copied -= copy;
                        ret = 0;
              }
              goto sendpage_end;
   }
  }


However, when above patch was tried, the application tried to send
remaining data based on the offset as expected, but encryption failed
due to data integrity issues.  Further debugging revealed that
sk_msg_trim(), called by tls_trim_both_msgs() does not update the page
frag offset.  It seems like sk_msg_trim() needs to subtract the trim
length from the pfrag->offset corresponding to how the sk_msg_alloc()
call increments the pfrag-->offset with the length it charges to the
socket via sk_mem_charge().
When sk_msg_trim() calls sk_mem_uncharge() to uncharge trim length off
the socket, it should also perform
           pfrag->offset -= trim;

While the sk_msg_trim() pfrag->offset subtraction change hasn't been
tried yet, the following hack in TLS layer has been tried and has
correctly worked. This proves that the above observation/theory
regarding pfrag->offset update would be the fix:


+                                       if (ret == -ENOSPC) {
+                                               struct page_frag *pfrag;
+                                               tls_trim_both_msgs(sk,
orig_size);
+
+                                               copied -= copy;
+                                               pfrag = sk_page_frag(sk);
+                                               pfrag->offset -= copy;


What are your thoughts on the best way to fix the issue?
sk_msg_trim() seems like the most logical place, but
suggestions/comments/questions are welcome.

Another thing to think about is whether -EBUSY should be handled
similarly. Vendors have differences and the conditions under which
these error codes are returned are not very consistent when the sidecar
device path is involved.