Patch "RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests" has been added to the 5.10-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests

to the 5.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     rdma-cma-ensure-rdma_addr_cancel-happens-before-issuing-more-requests.patch
and it can be found in the queue-5.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From 305d568b72f17f674155a2a8275f865f207b3808 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg@xxxxxxxxxx>
Date: Thu, 16 Sep 2021 15:34:46 -0300
Subject: RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests

From: Jason Gunthorpe <jgg@xxxxxxxxxx>

commit 305d568b72f17f674155a2a8275f865f207b3808 upstream.

The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
on the same id_priv. While this cannot happen without going through the
work, it violates the invariant that the same address resolution
background request cannot be active twice.

       CPU 1                                  CPU 2

rdma_resolve_addr():
  RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
  rdma_resolve_ip(addr_handler)  #1

			 process_one_req(): for #1
                          addr_handler():
                            RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
                            mutex_unlock(&id_priv->handler_mutex);
                            [.. handler still running ..]

rdma_resolve_addr():
  RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
  rdma_resolve_ip(addr_handler)
    !! two requests are now on the req_list

rdma_destroy_id():
 destroy_id_handler_unlock():
  _destroy_id():
   cma_cancel_operation():
    rdma_addr_cancel()

                          // process_one_req() self removes it
		          spin_lock_bh(&lock);
                           cancel_delayed_work(&req->work);
	                   if (!list_empty(&req->list)) == true

      ! rdma_addr_cancel() returns after process_on_req #1 is done

   kfree(id_priv)

			 process_one_req(): for #2
                          addr_handler():
	                    mutex_lock(&id_priv->handler_mutex);
                            !! Use after free on id_priv

rdma_addr_cancel() expects there to be one req on the list and only
cancels the first one. The self-removal behavior of the work only happens
after the handler has returned. This yields a situations where the
req_list can have two reqs for the same "handle" but rdma_addr_cancel()
only cancels the first one.

The second req remains active beyond rdma_destroy_id() and will
use-after-free id_priv once it inevitably triggers.

Fix this by remembering if the id_priv has called rdma_resolve_ip() and
always cancel before calling it again. This ensures the req_list never
gets more than one item in it and doesn't cost anything in the normal flow
that never uses this strange error path.

Link: https://lore.kernel.org/r/0-v1-3bc675b8006d+22-syz_cancel_uaf_jgg@xxxxxxxxxx
Cc: stable@xxxxxxxxxxxxxxx
Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager")
Reported-by: syzbot+dc3dfba010d7671e05f5@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Anton Gusev <aagusev@xxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 drivers/infiniband/core/cma.c      |   23 +++++++++++++++++++++++
 drivers/infiniband/core/cma_priv.h |    1 +
 2 files changed, 24 insertions(+)

--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1792,6 +1792,14 @@ static void cma_cancel_operation(struct
 {
 	switch (state) {
 	case RDMA_CM_ADDR_QUERY:
+		/*
+		 * We can avoid doing the rdma_addr_cancel() based on state,
+		 * only RDMA_CM_ADDR_QUERY has a work that could still execute.
+		 * Notice that the addr_handler work could still be exiting
+		 * outside this state, however due to the interaction with the
+		 * handler_mutex the work is guaranteed not to touch id_priv
+		 * during exit.
+		 */
 		rdma_addr_cancel(&id_priv->id.route.addr.dev_addr);
 		break;
 	case RDMA_CM_ROUTE_QUERY:
@@ -3401,6 +3409,21 @@ int rdma_resolve_addr(struct rdma_cm_id
 		if (dst_addr->sa_family == AF_IB) {
 			ret = cma_resolve_ib_addr(id_priv);
 		} else {
+			/*
+			 * The FSM can return back to RDMA_CM_ADDR_BOUND after
+			 * rdma_resolve_ip() is called, eg through the error
+			 * path in addr_handler(). If this happens the existing
+			 * request must be canceled before issuing a new one.
+			 * Since canceling a request is a bit slow and this
+			 * oddball path is rare, keep track once a request has
+			 * been issued. The track turns out to be a permanent
+			 * state since this is the only cancel as it is
+			 * immediately before rdma_resolve_ip().
+			 */
+			if (id_priv->used_resolve_ip)
+				rdma_addr_cancel(&id->route.addr.dev_addr);
+			else
+				id_priv->used_resolve_ip = 1;
 			ret = rdma_resolve_ip(cma_src_addr(id_priv), dst_addr,
 					      &id->route.addr.dev_addr,
 					      timeout_ms, addr_handler,
--- a/drivers/infiniband/core/cma_priv.h
+++ b/drivers/infiniband/core/cma_priv.h
@@ -89,6 +89,7 @@ struct rdma_id_private {
 	u8			reuseaddr;
 	u8			afonly;
 	u8			timeout;
+	u8 used_resolve_ip;
 	enum ib_gid_type	gid_type;
 
 	/*


Patches currently in stable-queue which might be from jgg@xxxxxxxxxx are

queue-5.10/rdma-bnxt_re-fix-to-remove-an-unnecessary-log.patch
queue-5.10/rdma-hns-fix-hns_roce_table_get-return-value.patch
queue-5.10/rdma-bnxt_re-fix-to-remove-unnecessary-return-labels.patch
queue-5.10/rdma-cma-ensure-rdma_addr_cancel-happens-before-issuing-more-requests.patch
queue-5.10/rdma-hns-use-refcount_t-apis-for-hem.patch
queue-5.10/rdma-remove-uverbs_ex_cmd_mask-values-that-are-linke.patch
queue-5.10/rdma-hns-fix-coding-style-issues.patch
queue-5.10/ib-hfi1-fix-wrong-mmu_node-used-for-user-sdma-packet.patch
queue-5.10/rdma-bnxt_re-remove-a-redundant-check-inside-bnxt_re.patch
queue-5.10/rdma-bnxt_re-disable-kill-tasklet-only-if-it-is-enab.patch
queue-5.10/ib-hfi1-use-bitmap_zalloc-when-applicable.patch
queue-5.10/rdma-hns-clean-the-hardware-related-code-for-hem.patch
queue-5.10/rdma-bnxt_re-use-unique-names-while-registering-inte.patch
queue-5.10/ib-hfi1-fix-sdma.h-tx-num_descs-off-by-one-errors.patch



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux