On Mon, Jul 04, 2011 at 09:50:19AM -0400, Thomas Graf wrote:
> When initiating a graceful shutdown while having data chunks
> on the retransmission queue with a peer which is in zero
> window mode the shutdown is never completed because the
> retransmission error count is reset periodically by the
> following two rules:
>
> - Do not timeout association while doing zero window probe.
> - Reset overall error count when a heartbeat request has
>   been acknowledged.
>
> The graceful shutdown will wait for all outstanding TSN to
> be acknowledged before sending the SHUTDOWN request. This
> never happens due to the peer's zero window not acknowledging
> the continuously retransmitted data chunks. Although the
> error counter is incremented for each failed retransmission,
> the receiving of the SACK announcing the zero window clears
> the error count again immediately. Also heartbeat requests
> continue to be sent periodically. The peer acknowledges these
> requests causing the error counter to be reset as well.
>
> This patch changes behaviour to only reset the overall error
> counter for the above rules while not in shutdown. After
> reaching the maximum number of retransmission attempts, the
> T5 shutdown guard timer is scheduled to give the receiver
> some additional time to recover. The timer is stopped as soon
> as the receiver acknowledges any data.
>
> The issue can be easily reproduced by establishing a sctp
> association over the loopback device, constantly queueing
> data at the sender while not reading any at the receiver.
> Wait for the window to reach zero, then initiate a shutdown
> by killing both processes simultaneously. The association
> will never be freed and the chunks on the retransmission
> queue will be retransmitted indefinitely.
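
The behaviour described above can be sketched as a small standalone C
model. This is only an illustration under stated assumptions: struct
assoc_model and the model_*() helpers are hypothetical names, not the
kernel's data structures or functions, and the ">=" comparison against
the retransmission limit is an assumption, since the description only
says "after reaching the maximum number of retransmission attempts".

```c
#include <stdbool.h>

/* Hypothetical, simplified model of one association; not the kernel's struct. */
struct assoc_model {
    int  overall_error_count; /* failed retransmissions counted so far */
    int  max_retrans;         /* retransmission limit for the association */
    bool shutdown_pending;    /* a graceful shutdown has been requested */
    bool t5_guard_running;    /* T5 shutdown guard timer is armed */
};

/*
 * Both reset rules (zero-window SACK received, heartbeat acknowledged)
 * clear the error count, but only while no shutdown is pending.
 */
static void model_peer_alive(struct assoc_model *a)
{
    if (!a->shutdown_pending)
        a->overall_error_count = 0;
}

/*
 * A retransmission went unacknowledged. Once the limit is reached during
 * shutdown, arm the guard timer to give the peer some additional time.
 * (Assumption: the limit check is ">=".)
 */
static void model_retransmit_failed(struct assoc_model *a)
{
    a->overall_error_count++;
    if (a->shutdown_pending && a->overall_error_count >= a->max_retrans)
        a->t5_guard_running = true;
}

/* The peer acknowledged some data: stop the guard timer again. */
static void model_data_acked(struct assoc_model *a)
{
    a->t5_guard_running = false;
}
```

With shutdown_pending set, calling model_peer_alive() between failed
retransmissions no longer clears the counter, so the limit is eventually
reached and the guard timer bounds the shutdown. Without a pending
shutdown the counter is reset as before.
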
>
> Signed-off-by: Thomas Graf <tgraf@xxxxxxxxxxxxx>
<snip>
> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -5154,7 +5154,7 @@ sctp_disposition_t sctp_sf_do_9_2_start_shutdown(
>    * The sender of the SHUTDOWN MAY also start an overall guard timer
>    * 'T5-shutdown-guard' to bound the overall time for shutdown sequence.
>    */
> - sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_START,
> + sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_RESTART,
>                   SCTP_TO(SCTP_EVENT_TIMEOUT_T5_SHUTDOWN_GUARD));
>
Why are you changing this hunk to use TIMER_RESTART rather than
TIMER_START? sctp_sf_do_9_2_start_shutdown() is where the T5 timer is
actually started, isn't it? The rest looks OK to me, I think.

Neil