On 2022/3/1 9:17 PM, D. Wythe wrote:
From: "D. Wythe" <alibuda@xxxxxxxxxxxxxxxxx>
Removing a connection from the link group is not synchronous with the
handling of SMC_LLC_DELETE_RKEY. Even when the number of connections is
less than SMC_RMBS_PER_LGR_MAX, a new connection is therefore not
guaranteed to register its rtoken successfully, because the rtoken entry
may not have been released yet. This causes an unexpected
SMC_CLC_DECL_ERR_REGRMB to be reported, and the SMC connection then has
to fall back to TCP.
We found that the main cause of the problem is the following execution
sequence:
Server Conn A:           Server Conn B:           Client Conn B:

smc_lgr_unregister_conn
                         smc_lgr_register_conn
                         smc_clc_send_accept  ->
                                                  smc_rtoken_add
smcr_buf_unuse
        ->               Client Conn A:
                         smc_rtoken_delete
smc_lgr_unregister_conn() makes the current link available to be assigned
to a new incoming connection while smcr_buf_unuse() has not executed yet,
which means that smc_rtoken_add() may fail because no rtoken entry is
free. Reversing their execution order avoids this problem.
Fixes: 3e034725c0d8 ("net/smc: common functions for RMBs and send buffers")
Signed-off-by: D. Wythe <alibuda@xxxxxxxxxxxxxxxxx>
---
net/smc/smc_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 2f321d2..c9c3a68 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -1161,8 +1161,8 @@ void smc_conn_free(struct smc_connection *conn)
 		cancel_work_sync(&conn->abort_work);
 	}
 	if (!list_empty(&lgr->list)) {
-		smc_lgr_unregister_conn(conn);
 		smc_buf_unuse(conn, lgr); /* allow buffer reuse */
+		smc_lgr_unregister_conn(conn);
 	}
 	if (!lgr->conns_num)
I have two patches for this issue and I missed one. I'll post it in the
v2 series.
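
To make the ordering argument in the commit message concrete, here is a
minimal user-space sketch, not kernel code. All toy_* names and MAX_CONNS
are hypothetical and exist only for this illustration; MAX_CONNS merely
stands in for SMC_RMBS_PER_LGR_MAX. The model also assumes, as a
simplification, that the connection count and the rtoken entries live in
one struct, whereas in the scenario above the rtoken table is on the peer
and is only freed once SMC_LLC_DELETE_RKEY has been handled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CONNS 2 /* hypothetical; stands in for SMC_RMBS_PER_LGR_MAX */

/* Toy link group: connection slots and rtoken entries are separate pools. */
struct toy_lgr {
	int conns_num;    /* registered connections */
	int rtokens_used; /* rtoken entries still held */
};

/* A new connection is admitted only while a connection slot is free. */
static bool toy_register_conn(struct toy_lgr *lgr)
{
	if (lgr->conns_num >= MAX_CONNS)
		return false;
	lgr->conns_num++;
	return true;
}

/* Models smc_lgr_unregister_conn(): frees a connection slot only. */
static void toy_unregister_conn(struct toy_lgr *lgr)
{
	assert(lgr->conns_num > 0);
	lgr->conns_num--;
}

/* Models smc_rtoken_add(): fails when every rtoken entry is in use. */
static bool toy_rtoken_add(struct toy_lgr *lgr)
{
	if (lgr->rtokens_used >= MAX_CONNS)
		return false;
	lgr->rtokens_used++;
	return true;
}

/* Models the buffer release that eventually frees an rtoken entry. */
static void toy_buf_unuse(struct toy_lgr *lgr)
{
	assert(lgr->rtokens_used > 0);
	lgr->rtokens_used--;
}

/*
 * Connection A tears down on a fully loaded link group, and connection B
 * arrives between A's two teardown steps. Returns true if B is admitted
 * to the link group but cannot obtain an rtoken entry.
 */
bool toy_race_window_fails(bool unuse_first)
{
	struct toy_lgr lgr = { MAX_CONNS, MAX_CONNS };
	bool admitted, got_token;

	/* First teardown step of connection A. */
	if (unuse_first)
		toy_buf_unuse(&lgr);
	else
		toy_unregister_conn(&lgr);

	/* Connection B arrives before A's second teardown step. */
	admitted = toy_register_conn(&lgr);
	got_token = admitted && toy_rtoken_add(&lgr);

	return admitted && !got_token;
}
```

In this model, toy_race_window_fails(false) (unregister first, the old
order) returns true: B takes the freed connection slot but finds no free
rtoken entry. toy_race_window_fails(true) (buffer release first, the new
order) returns false, because B is not admitted to this link group until
both resources have been released.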