On 04.06.24 22:15, Bart Van Assche wrote:
On 6/4/24 03:26, Zhu Yanjun wrote:
On 04.06.24 09:25, Shinichiro Kawasaki wrote:
As I noted in another thread [1], KASAN slab-use-after-free is
observed when
I repeat the blktests test case srp/002 with the siw driver [2]. The
kernel
version was v6.10-rc2. The failure is recreated in stable manner when
the test
case is repeated around 30 times. It was not observed with the rxe
driver.
I think this failure is same as that I reported in Jun/2023 [3]. The
Call Trace
reported is quite similar. Also, I confirmed that the trial fix patch
that I
created in Jun/2023 avoided the KASAN failure at srp/002.
"the trial fix patch that I created in Jun/2023" that you mentioned is
the commit in the link?
https://lore.kernel.org/linux-rdma/20230612054237.1855292-1-shinichiro.kawasaki@xxxxxxx/
To me that patch doesn't seem correct. Jason and Leon, is my understanding
correct that you are the maintainers for the iwcm code? Can you please help
with reviewing this patch?
Thanks,
Bart.
From 879ca4e5f9ab8c4ce522b4edc144a3938a2f4afb Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche@xxxxxxx>
Date: Tue, 4 Jun 2024 12:49:44 -0700
Subject: [PATCH] RDMA/iwcm: Fix a use-after-free related to destroying
CM IDs
iw_conn_req_handler() associates a new struct rdma_id_private (conn_id)
with
an existing struct iw_cm_id (cm_id) as follows:
conn_id->cm_id.iw = cm_id;
cm_id->context = conn_id;
cm_id->cm_handler = cma_iw_handler;
rdma_destroy_id() frees both the cm_id and the struct rdma_id_private. Make
sure that cm_work_handler() does not trigger a use-after-free by delaing
freeing of the struct rdma_id_private until all pending work has finished.
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
drivers/infiniband/core/iwcm.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/core/iwcm.c
b/drivers/infiniband/core/iwcm.c
index d608952c6e8e..ea9dc26bf563 100644
--- a/drivers/infiniband/core/iwcm.c
+++ b/drivers/infiniband/core/iwcm.c
@@ -368,8 +368,10 @@ EXPORT_SYMBOL(iw_cm_disconnect);
*
* Clean up all resources associated with the connection and release
* the initial reference taken by iw_create_cm_id.
+ *
+ * Returns true if and only if the last cm_id_priv reference has been
dropped.
*/
-static void destroy_cm_id(struct iw_cm_id *cm_id)
+static bool destroy_cm_id(struct iw_cm_id *cm_id)
Now the type of destroy_cm_id is changed from void to bool.
{
struct iwcm_id_private *cm_id_priv;
struct ib_qp *qp;
@@ -439,7 +441,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
iwpm_remove_mapping(&cm_id->local_addr, RDMA_NL_IWCM);
}
- (void)iwcm_deref_id(cm_id_priv);
+ return iwcm_deref_id(cm_id_priv);
static int iwcm_deref_id(struct iwcm_id_private *cm_id_priv)
The type of iwcm_deref_id is int.
Not sure if we should make iwcm_deref_id and destroy_cm_id have the
different type or not. We can make them use one of the 2 types: int and
bool.
Reviewed-by: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
Zhu Yanjun
}
/*
@@ -450,7 +452,8 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
*/
void iw_destroy_cm_id(struct iw_cm_id *cm_id)
{
- destroy_cm_id(cm_id);
+ if (!destroy_cm_id(cm_id))
+ flush_workqueue(iwcm_wq);
}
EXPORT_SYMBOL(iw_destroy_cm_id);
@@ -1031,7 +1034,7 @@ static void cm_work_handler(struct work_struct
*_work)
if (!test_bit(IWCM_F_DROP_EVENTS, &cm_id_priv->flags)) {
ret = process_event(cm_id_priv, &levent);
if (ret)
- destroy_cm_id(&cm_id_priv->id);
+ WARN_ON_ONCE(destroy_cm_id(&cm_id_priv->id));
} else
pr_debug("dropping event %d\n", levent.event);
if (iwcm_deref_id(cm_id_priv))