On 9/8/23 18:55, Bernard Metzler wrote:
-----Original Message-----
From: Guoqing Jiang <guoqing.jiang@xxxxxxxxx>
Sent: Friday, 8 September 2023 03:35
To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx
Cc: jgg@xxxxxxxx; leon@xxxxxxxxxx
Subject: [EXTERNAL] Re: [PATCH v2] RDMA/siw: Fix connection failure
handling
Hi,
On 9/5/23 22:58, Bernard Metzler wrote:
In case immediate MPA request processing fails, the newly
created endpoint unlinks the listening endpoint and is
ready to be dropped. This special case was not handled
correctly by the code handling the later TCP socket close,
causing a NULL dereference crash in siw_cm_work_handler()
when dereferencing a NULL listener. We now also cancel
the useless MPA timeout, if immediate MPA request
processing fails.
This patch furthermore simplifies MPA processing in general:
Scheduling a useless TCP socket read in sk_data_ready() upcall
is now surpressed, if the socket is already moved out of
TCP_ESTABLISHED state.
Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Signed-off-by: Bernard Metzler<bmt@xxxxxxxxxxxxxx>
---
ChangeLog v1->v2:
- Move debug message to now conditional listener drop
---
drivers/infiniband/sw/siw/siw_cm.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/sw/siw/siw_cm.c
b/drivers/infiniband/sw/siw/siw_cm.c
index a2605178f4ed..43e776073f49 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -976,6 +976,7 @@ static void siw_accept_newconn(struct siw_cep *cep)
siw_cep_put(cep);
new_cep->listen_cep = NULL;
if (rv) {
+ siw_cancel_mpatimer(new_cep);
One question, why siw_cm_work_handler set cep->mpa_timer to NULL instead
of call
siw_cancel_mpatimer like above? Could it be memory leak issue for
cep->mpa_timer?
Thanks for reviewing.
Above in the proposed patch, we cancel a pending
MPA timer.
You are referring to siw_cm.c:1115
case SIW_CM_WORK_MPATIMEOUT:
cep->mpa_timer = NULL;
We explicitly set it NULL here since we are handling
the timeout, so the MPA timer has fired and we
cannot cancel it anymore. At the end of
siw_cm_work_handler(), any work, including the MPA
timeout, gets recycled to the cep's list of free work
items via siw_put_work(). This list is recycled only at
the end of the cep's lifetime.
Got it, and thanks for detailed explanation!
Let's say mpa_timer is set to NULL before call siw_cancel_mpatimer.
siw_cep_set_free(new_cep);
goto error;
}
@@ -1100,9 +1101,12 @@ static void siw_cm_work_handler(struct work_struct
*w)
/*
* Socket close before MPA request received.
*/
- siw_dbg_cep(cep, "no mpareq: drop listener\n");
- siw_cep_put(cep->listen_cep);
- cep->listen_cep = NULL;
+ if (cep->listen_cep) {
+ siw_dbg_cep(cep,
+ "no mpareq: drop listener\n");
+ siw_cep_put(cep->listen_cep);
+ cep->listen_cep = NULL;
+ }
}
}
release_cep = 1;
@@ -1227,7 +1231,11 @@ static void siw_cm_llp_data_ready(struct sock *sk)
if (!cep)
goto out;
- siw_dbg_cep(cep, "state: %d\n", cep->state);
+ siw_dbg_cep(cep, "cep state: %d, socket state %d\n",
+ cep->state, sk->sk_state);
+
+ if (sk->sk_state != TCP_ESTABLISHED)
+ goto out;
Maybe split above to another patch makes more sense, just my $0.02.
I preferred to have it in one patch, since it was all triggered
by your patch b056327bee09
'RDMA/siw: Balance the reference of cep->kref in the error path'
Alarmed by that, I re-tested many error cases including a failing
MPA request processing, which triggered this NULL bug. It also
showed me this useless extra read event processing when the TCP
socket is closing while the endpoint is still in MPA handshake mode.
Good to know the background :).
But you are right, it is not necessarily related. I can split it.
With or without the split, Acked-by: Guoqing Jiang <guoqing.jiang@xxxxxxxxx>
Thanks,
Guoqing