From: Daniel Borkmann
>
> Problem statement: 1) both paths (primary path1 and alternate
> path2) are up after the association has been established i.e.,
> HB packets are normally exchanged, 2) path2 gets inactive after
> path_max_retrans * max_rto timed out (i.e. path2 is down completely),
> 3) now, if a transmission times out on the only surviving/active
> path1 (any ~1sec network service impact could cause this like
> a channel bonding failover), then the retransmitted packets are
> sent over the inactive path2; this happens with partial failover
> and without it.
>
> Besides not being optimal in the above scenario, a small failure
> or timeout in the only existing path has the potential to cause
> long delays in the retransmission (depending on RTO_MAX) until
> the still active path is reselected.

The current behaviour doesn't seem very good - real networks tend
to have non-zero packet loss these days (for all sorts of reasons).

I guess that under moderate traffic flow, retransmit requests from
the remote system recover the data before a timeout actually occurs.
That probably means that a path with a high error rate will continue
to be used when an alternate path would be much better.

I was wondering whether it is valid (or even reasonable) to send the
retransmit down multiple paths, particularly if they are not known
to be working? Or maybe resend heartbeats in a desperate attempt to
find a working path?

Do you guys know which kernel version(s) have that patch?
We have a few customers using SCTP (for M3UA) and I really ought
to keep track of the 'good' and 'bad' kernel versions.

	David
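
For illustration only, the selection described in the quoted problem statement could be sketched roughly as below. This is not the kernel's code and not the patch under discussion; the names (struct path, PATH_ACTIVE, pick_retrans_path) are made up for the example. The only rule it encodes is: after a timeout, prefer another active path, and if there is none, stay on the current path rather than move to one already marked inactive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical path states for this sketch; these are not the
 * kernel's own definitions. */
enum path_state { PATH_INACTIVE, PATH_ACTIVE };

struct path {
	enum path_state state;
};

/*
 * Pick the path to retransmit on after a timeout on paths[cur].
 * Returns the index of another ACTIVE path if one exists,
 * otherwise returns cur unchanged.
 */
size_t pick_retrans_path(const struct path *paths, size_t n, size_t cur)
{
	assert(paths != NULL && n > 0 && cur < n);

	/* Walk the other paths in round-robin order, starting after cur. */
	for (size_t i = 1; i < n; i++) {
		size_t cand = (cur + i) % n;

		if (paths[cand].state == PATH_ACTIVE)
			return cand;
	}

	/* No active alternate: do not move to a path known to be down. */
	return cur;
}
```

With path1 active and path2 inactive, a timeout on path1 keeps the retransmission on path1; path2 is only chosen once it is active again.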