Dear list,

I have run into a problem with DLM in SCTP mode. I have two nodes, cls4 and cls5, and the Pacemaker configuration looks like this:

    node cls4
    node cls5
    primitive clvm ocf:lvm2:clvmd \
        params daemon_timeout="30" daemon_options="-d2"
    primitive dlm ocf:pacemaker:controld \
        op monitor interval="60" timeout="60"
    primitive o2cb ocf:ocfs2:o2cb \
        op monitor interval="60" timeout="60"
    primitive stonith_sbd stonith:external/sbd \
        params sbd_device="/dev/disk/by-path/ip-147.2.207.178:3260-iscsi-iqn.2012-05.com.example:e679021e-470d-493f-902f-45dffad3e32d-lun-0-part2" \
        meta target-role="Started"

The netstat output for SCTP on cls5 looks like this:

    cls5:~ # netstat -apn|grep sctp
    sctp                192.168.1.5:21064                       LISTEN       -
    sctp   0      3     0.0.82.72:6713      192.168.1.5:21064   ESTABLISHED  -

If I run "echo b > /proc/sysrq-trigger" on cls4, cls5 notices that cls4 is gone and tries to close the SCTP association. The graceful shutdown of the association is extremely slow, because the other endpoint is gone and cannot answer with a SHUTDOWN-ACK. netstat keeps showing:

    cls5:~ # netstat -apn|grep sctp
    sctp                192.168.1.5:21064                       LISTEN       -
    sctp   0      3     0.0.82.72:6713      192.168.1.5:21064   CLOSING      -

I suggest setting SO_LINGER on the socket so that the association is aborted quickly, because a graceful shutdown is not necessary when the other endpoint is down.
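To make the SO_LINGER idea concrete outside the kernel, here is a minimal user-space sketch. It is not the DLM code: the helper names set_abortive_close and open_abortive_socket are hypothetical, and the fallback to TCP is only an assumption so that the example still runs on machines without SCTP support. With l_onoff = 1 and l_linger = 0, close() aborts the connection (an ABORT chunk for SCTP, a RST for TCP) instead of waiting for the peer to complete a graceful shutdown.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DLM): request an abortive close.
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 */
int set_abortive_close(int fd)
{
	struct linger lg;

	lg.l_onoff = 1;		/* enable SO_LINGER ... */
	lg.l_linger = 0;	/* ... with a zero timeout: close() aborts */

	return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/*
 * Hypothetical helper: open a one-to-one style SCTP socket, falling back
 * to TCP if SCTP is unavailable (assumption, for portability of the
 * sketch), and enable abortive close on it.
 * Returns the descriptor, or -1 on error.
 */
int open_abortive_socket(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);

	if (fd < 0)
		fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	if (set_abortive_close(fd) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would connect the descriptor returned by open_abortive_socket() as usual. When the peer later disappears, close() on that descriptor returns without lingering in the shutdown sequence. The cost is that any data still queued for sending is discarded.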
I have this patch:

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index d90909e..05b0240 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -727,11 +727,22 @@ static void process_sctp_notification(struct connection *con,
 		}
 		add_sock(new_con->sock, new_con);
 
+		struct linger linger;
+		linger.l_onoff = 1;
+		linger.l_linger = 0;
+		ret = sock_setsockopt (new_con->sock, SOL_SOCKET, SO_LINGER,
+				(char *)&linger, sizeof (linger));
+		if (ret < 0)
+			log_print("set socket option SO_LINGER failed!");
+
+
+
 		log_print("connecting to %d sctp association %d",
 			 nodeid, (int)sn->sn_assoc_change.sac_assoc_id);
 
 		new_con->sctp_assoc = sn->sn_assoc_change.sac_assoc_id;
 		new_con->try_new_addr = false;
+
 		/* Send any pending writes */
 		clear_bit(CF_CONNECT_PENDING, &new_con->flags);
 		clear_bit(CF_INIT_PENDING, &new_con->flags);

What do you think about it?

Dongmao zhang