On Wed, Jul 18, 2012 at 05:23:05PM -0400, Vlad Yasevich wrote: > On 07/18/2012 02:01 PM, Neil Horman wrote: > >I've seen several attempts recently made to do quick failover of sctp transports > >by reducing various retransmit timers and counters. While its possible to > >implement a faster failover on multihomed sctp associations, its not > >particularly robust, in that it can lead to unneeded retransmits, as well as > >false connection failures due to intermittent latency on a network. > > > >Instead, lets implement the new ietf quick failover draft found here: > >http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05 > > > >This will let the sctp stack identify transports that have had a small number of > >errors, and avoid using them quickly until their reliability can be > >re-established. I've tested this out on two virt guests connected via multiple > >isolated virt networks and believe its in compliance with the above draft and > >works well. > > > >Signed-off-by: Neil Horman <nhorman@xxxxxxxxxxxxx> > >CC: Vlad Yasevich <vyasevich@xxxxxxxxx> > >CC: Sridhar Samudrala <sri@xxxxxxxxxx> > >CC: "David S. Miller" <davem@xxxxxxxxxxxxx> > >CC: linux-sctp@xxxxxxxxxxxxxxx > > > >--- > >Change notes: > > > >V2) > >- Added socket option API from section 6.1 of the specification, as per > >request from Vlad. Adding this socket option allows us to alter both the path > >maximum retransmit value and the path partial failure threshold for each > >transport and the association as a whole. > > > >- Added a per transport pf_retrans value, and initialized it from the > >association value. This makes each transport independently configurable as per > >the socket option above, and prevents changes in the sysctl from bleeding into > >an already created association. > >--- > > Documentation/networking/ip-sysctl.txt | 14 +++++ > > include/net/sctp/constants.h | 1 + > > include/net/sctp/structs.h | 11 +++- > > include/net/sctp/user.h | 11 ++++ > > net/sctp/associola.c | 36 ++++++++++-- > > net/sctp/outqueue.c | 6 +- > > net/sctp/sm_sideeffect.c | 33 ++++++++++- > > net/sctp/socket.c | 96 ++++++++++++++++++++++++++++++++ > > net/sctp/sysctl.c | 9 +++ > > net/sctp/transport.c | 4 +- > > 10 files changed, 206 insertions(+), 15 deletions(-) > > > > [ snip ] > > >diff --git a/net/sctp/socket.c b/net/sctp/socket.c > >index b3b8a8d..dfffece 100644 > >--- a/net/sctp/socket.c > >+++ b/net/sctp/socket.c > >@@ -3470,6 +3470,52 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval, > > } > > > > > >+/* > >+ * SCTP_PEER_ADDR_THLDS > >+ * > >+ * This option allows us to alter the partially failed threshold for one or all > >+ * transports in an association. See Section 6.1 of: > >+ * http://www.ietf.org/id/draft-nishida-tsvwg-sctp-failover-05.txt > >+ */ > >+static int sctp_setsockopt_paddr_thresholds(struct sock *sk, > >+ char __user *optval, > >+ unsigned int optlen) > >+{ > >+ struct sctp_paddrthlds val; > >+ struct sctp_transport *trans; > >+ struct sctp_association *asoc; > >+ > >+ if (optlen < sizeof(struct sctp_paddrthlds)) > >+ return -EINVAL; > >+ if (copy_from_user(&val, (struct sctp_paddrthlds __user *)optval, > >+ optlen)) > >+ return -EFAULT; > > What if optlen is bigger? You going to trash the stack. > > >+ > >+ if (sctp_is_any(sk, (const union sctp_addr *)&val.spt_address)) { > >+ asoc = sctp_id2assoc(sk, val.spt_assoc_id); > >+ if (!asoc) > >+ return -ENOENT; > >+ list_for_each_entry(trans, &asoc->peer.transport_addr_list, > >+ transports) { > >+ trans->pathmaxrxt = val.spt_pathmaxrxt; > >+ trans->pf_retrans = val.spt_pathpfthld; > > You want to make sure that the values aren't 0. Otherwise, you'll > set the pathmaxrxt to 0 and that would be bad. > > >+ } > >+ > >+ asoc->pf_retrans = val.spt_pathpfthld; > >+ asoc->pathmaxrxt = val.spt_pathmaxrxt; > > Ditto. > > >+ } else { > >+ trans = sctp_addr_id2transport(sk, &val.spt_address, > >+ val.spt_assoc_id); > >+ if (!trans) > >+ return -ENOENT; > >+ > >+ trans->pathmaxrxt = val.spt_pathmaxrxt; > >+ trans->pf_retrans = val.spt_pathpfthld; > > Ditto. > > >+ } > >+ > >+ return 0; > >+} > >+ > > /* API 6.2 setsockopt(), getsockopt() > > * > > * Applications use setsockopt() and getsockopt() to set or retrieve > >@@ -3619,6 +3665,9 @@ SCTP_STATIC int sctp_setsockopt(struct sock *sk, int level, int optname, > > case SCTP_AUTO_ASCONF: > > retval = sctp_setsockopt_auto_asconf(sk, optval, optlen); > > break; > >+ case SCTP_PEER_ADDR_THLDS: > >+ retval = sctp_setsockopt_paddr_thresholds(sk, optval, optlen); > >+ break; > > default: > > retval = -ENOPROTOOPT; > > break; > >@@ -5490,6 +5539,50 @@ static int sctp_getsockopt_assoc_ids(struct sock *sk, int len, > > return 0; > > } > > > >+/* > >+ * SCTP_PEER_ADDR_THLDS > >+ * > >+ * This option allows us to fetch the partially failed threshold for one or all > >+ * transports in an association. See Section 6.1 of: > >+ * http://www.ietf.org/id/draft-nishida-tsvwg-sctp-failover-05.txt > >+ */ > >+static int sctp_getsockopt_paddr_thresholds(struct sock *sk, > >+ char __user *optval, > >+ int optlen) > >+{ > >+ struct sctp_paddrthlds val; > >+ struct sctp_transport *trans; > >+ struct sctp_association *asoc; > >+ > >+ if (optlen < sizeof(struct sctp_paddrthlds)) > >+ return -EINVAL; > >+ if (copy_from_user(&val, (struct sctp_paddrthlds __user *)optval, optlen)) > >+ return -EFAULT; > > Again, trashing the stack if optlen and optval are bigger. > > -vlad Ack, I'll fix these up and repost. Thanks! Neil -- To unsubscribe from this list: send the line "unsubscribe linux-sctp" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html