Makes sense. Thanks for your response.

Atalay

On Fri, May 15, 2015 at 9:39 AM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> On Wed, May 13, 2015 at 10:27:05AM -0400, Atalay Ozgovde wrote:
>> We are essentially trying to implement "fire and forget" TCP with
>> SCTP. That is, a single server port that services clients that
>> establish separate connections. We are not broadcasting the same
>> messages to every client. Each client can issue different sets of
>> requests to the server and receive responses as streams of messages
>> over its own channel. Messages are time sensitive and expire fast,
>> therefore retransmission of lost messages is not desirable. Due to the
>> large number of clients and the uniqueness of clients' needs, UDP is
>> not an option.
>> We have one-to-one SCTP connections to clients where we send unordered
>> messages with short time-to-live parameters. When a client's network
>> degrades to about 60% packet loss (which happens often in some
>> regions), the server can continue to write (sctp_sendmsg) to the
>> client's connection, but Wireshark shows that the messages don't get
>> to the wire (blocked at the transport layer).
>> We enabled SCTP kernel logging and here is what we see:
>> We start getting the following for each write attempt (some lines are
>> removed for brevity):
>> May 11 09:58:43 localhost kernel: [250220.363867] sctp:
>> sctp_outq_flush: could not transmit TSN: 0x0, status: 2
>> May 11 09:58:43 localhost kernel: [250220.363870] sctp: sctp_do_sm
>> post sfx: error 0, asoc ffff88036943a000[STATE_ESTABLISHED]
>>
>> After several of the above, eventually we get:
>>
>> May 11 09:58:43 localhost kernel: [250220.363871] sctp: We sent primitively.
>> May 11 09:58:43 localhost kernel: [250220.363933] sctp: sctp_close(sk:
>> 0xffff880369693dc0, timeout:0)
>> May 11 09:58:43 localhost kernel: [250220.363938] sctp: sctp_do_sm
>> prefn: ep ffff8807d159e200, EVENT_T_PRIMITIVE, PRIMITIVE_SHUTDOWN,
>> asoc ffff88036943a000[STATE_ESTABLISHED],
>> sctp_sf_do_9_2_prm_shutdown
>> May 11 09:58:43 localhost kernel: [250220.363941] sctp: sctp_do_sm
>> postfn: asoc ffff88036943a000, status: DISPOSITION_CONSUME
>> May 11 09:58:43 localhost kernel: [250220.363943] sctp:
>> sctp_cmd_new_state: asoc ffff88036943a000[STATE_SHUTDOWN_PENDING]
>> May 11 09:58:43 localhost kernel: [250220.363945] sctp: sctp_do_sm
>> post sfx: error 0, asoc ffff88036943a000[STATE_CLOSED]
>> May 11 09:58:43 localhost kernel: [250220.363947] sctp:
>> sctp_destroy_sock(sk: ffff880369693dc0)
>>
>> SCTP is abandoning the connection due to status = 2. I found in the
>> kernel SCTP source that it means SCTP_XMIT_RWND_FULL, i.e. the receive
>> window is full. Clearly SCTP is reacting to what it sees as heavy
>> congestion.
>> We can detect congestion before the connection is closed (using SCTP
>> events). My question is: is there a way to reset a connection
>> (association) without having to close it? Alternatively, is there a
>> way to relax congestion parameters so that we can continue using the
>> connection, as we don't care about the packet loss?
>>
>> Thanks,
>>
>> Atalay
>
> There's no way to reset a failed connection short of closing it and
> re-establishing a new one.
> You also can't "relax" the rwnd congestion parameter
> directly, but you can change sctp_mem/sctp_rmem so that newly established
> connections compute a larger receive window when they are set up.
> Neil
>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
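
For anyone following up on Neil's suggestion, here is a minimal sketch (not from the thread) for checking the current sctp_mem/sctp_rmem values before changing them. It assumes the usual procfs location /proc/sys/net/sctp/, which is only present while the sctp kernel module is loaded, and that each value is three whitespace-separated integers. The helper names parse_triple and read_sctp_sysctl are made up for illustration.

```python
from pathlib import Path

# Assumed location of the SCTP sysctls under procfs; these files exist only
# while the kernel's sctp module is loaded.
SCTP_SYSCTL_DIR = Path("/proc/sys/net/sctp")


def parse_triple(text):
    """Parse a whitespace-separated three-integer sysctl value."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 integers, got {len(fields)}: {text!r}")
    return tuple(int(field) for field in fields)


def read_sctp_sysctl(name, base=SCTP_SYSCTL_DIR):
    """Return the three values of net.sctp.<name>, or None if the file is absent."""
    try:
        return parse_triple((base / name).read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # Print the two settings mentioned in the reply above.
    for name in ("sctp_mem", "sctp_rmem"):
        print(name, read_sctp_sysctl(name))
```

Changing the values needs root (for example with sysctl -w net.sctp.sctp_rmem=...), and per the reply above it only affects associations established afterwards.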