On 2022/9/20 12:55, dust.li wrote:
On Tue, Sep 20, 2022 at 10:53:54AM +0800, Wen Gu wrote:
SMC-R tests the viability of link by sending out TEST_LINK LLC
messages over RoCE fabric when connections on link have been
idle for a time longer than keepalive interval (testlink time).
But using tcp_keepalive_time as testlink time maybe not quite
suitable because it is default no less than two hours[1], which
is too long for single link to find peer dead. The active host
will still use peer-dead link (QP) sending messages, and can't
find out until get IB_WC_RETRY_EXC_ERR error CQEs, which takes
more time than TEST_LINK timeout (SMC_LLC_WAIT_TIME) normally.
So this patch introduces a independent sysctl for SMC-R to set
link keepalive time, in order to detect link down in time. The
default value is 30 seconds.
[1] https://www.rfc-editor.org/rfc/rfc1122#page-101
Signed-off-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx>
---
/* called after lgr was removed from lgr_list */
diff --git a/net/smc/smc_llc.h b/net/smc/smc_llc.h
index 4404e52..1de9a29 100644
--- a/net/smc/smc_llc.h
+++ b/net/smc/smc_llc.h
@@ -19,6 +19,7 @@
#define SMC_LLC_WAIT_FIRST_TIME (5 * HZ)
#define SMC_LLC_WAIT_TIME (2 * HZ)
+#define SMC_LLC_TESTLINK_DEFAULT_TIME 30
I'm wondering why we don't follow the upper to macros using (30 * HZ) ?
Thanks for the reivew.
Because the value of sysctl_smcr_testlink_time is in seconds, and the value
of llc_testlink_time is jiffies.
I have thought about
1) using proc_dointvec_jiffies as sysctl's proc_handler just like TCP does.
But proc_dointvec_jiffies has no minimum limit, value 0 makes no sense for SMC testlink.
2) using proc_dointvec_ms_jiffies_minmax as proc_handler. But millisecond interval
seems expensive for SMC test link.
So, I choose to use proc_dointvec_minmax, make sysctl_smcr_testlink_time in
seconds, and convert to jiffies when assigning to llc_testlink_time.
Thanks,
Wen Gu.