Re: [PATCH net 2/2] net: sctp: fix suboptimal edge-case on non-active active/retrans path selection

On Fri, Aug 22, 2014 at 01:03:30PM +0200, Daniel Borkmann wrote:
> In SCTP, selection of active (T.ACT) and retransmission (T.RET)
> transports is being done whenever transport control operations
> (UP, DOWN, PF, ...) are engaged through sctp_assoc_control_transport().
> 
> Commits 4c47af4d5eb2 ("net: sctp: rework multihoming retransmission
> path selection to rfc4960") and a7288c4dd509 ("net: sctp: improve
> sctp_select_active_and_retran_path selection") have both improved
> it towards a more fine-grained and optimal path selection.
> 
> Currently, the selection algorithm for T.ACT and T.RET is as follows:
> 
> 1) Elect the two most recently used ACTIVE transports T1, T2 for
>    T.ACT, T.RET, where T.ACT<-T1 and T1 is most recently used
> 2) In case primary path T.PRI not in {T1, T2} but ACTIVE, set
>    T.ACT<-T.PRI and T.RET<-T1
> 3) If only T1 is ACTIVE from the set, set T.ACT<-T1 and T.RET<-T1
> 4) If none is ACTIVE, set T.ACT<-best(T.PRI, T.RET, T3) where
>    T3 is the most recently used (if avail) in PF, set T.RET<-T.PRI
> 
> Prior to the above commits, 4) simply camped on T.ACT<-T.PRI and
> T.RET<-T.PRI, ignoring possible paths in PF. Camping on T.PRI is
> still slightly suboptimal as it can lead to the following scenario:
> 
> Setup:
>         <A>                                <B>
>     T1: p1p1 (10.0.10.10) <==>  .'`)  <==> p1p1 (10.0.10.12)  <= T.PRI
>     T2: p1p2 (10.0.10.20) <==> (_ . ) <==> p1p2 (10.0.10.22)
> 
>     net.sctp.rto_min = 1000
>     net.sctp.path_max_retrans = 2
>     net.sctp.pf_retrans = 0
>     net.sctp.hb_interval = 1000
> 
> T.PRI is permanently down and T2 is briefly put into PF state (e.g. due
> to link flapping). Here, the first transmission is sent over the PF path
> T2 as it's the only non-INACTIVE path, but the retransmitted DATA chunks
> are sent over the INACTIVE path T1 (T.PRI), which is not good.
> 
> After the patch, it's choosing better transports in both cases by
> modifying step 4):
> 
> 4) If none is ACTIVE, set T.ACT_new<-best(T.ACT_old, T3) where T3 is
>    the most recently used (if avail) in PF, set T.RET<-T.ACT_new
> 
> This will still select a best possible path in PF if available (which
> can also include T.PRI/T.RET), and set both T.ACT/T.RET to it.
> 
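For illustration, the difference between the old and the new step 4) can
be modelled in a few lines of standalone C. This is only a sketch under
stated assumptions, not the kernel code: struct transport, struct peer,
elect_best(), fallback_old() and fallback_new() are simplified,
hypothetical stand-ins for what lives in net/sctp/associola.c, and the
ranking ACTIVE > PF > INACTIVE (ties going to the second argument) is an
assumption of this model.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified states; the enum value doubles as the score (assumption). */
enum state { INACTIVE, PF, ACTIVE };

struct transport {
	enum state state;
};

struct peer {
	struct transport *primary_path;	/* T.PRI */
	struct transport *active_path;	/* T.ACT */
	struct transport *retran_path;	/* T.RET */
};

/* Return the higher-ranked transport; `best` wins ties, and a NULL
 * `best` simply yields `curr`.
 */
static struct transport *elect_best(struct transport *curr,
				    struct transport *best)
{
	assert(curr != NULL);
	if (best == NULL)
		return curr;
	return curr->state > best->state ? curr : best;
}

/* Step 4 before the patch:
 * T.ACT<-best(T.PRI, T.RET, T3), T.RET<-T.PRI.
 */
static void fallback_old(struct peer *p, struct transport *pf)
{
	struct transport *act;

	act = elect_best(p->primary_path, p->retran_path);
	act = elect_best(act, pf);
	p->active_path = act;
	p->retran_path = p->primary_path;
}

/* Step 4 after the patch:
 * T.ACT_new<-best(T.ACT_old, T3), T.RET<-T.ACT_new.
 */
static void fallback_new(struct peer *p, struct transport *pf)
{
	p->active_path = elect_best(p->active_path, pf);
	p->retran_path = p->active_path;
}
```

In the setup above (T1 = T.PRI INACTIVE, T2 in PF and currently used as
T.ACT/T.RET), fallback_old() keeps T.ACT on T2 but moves T.RET to the
INACTIVE T1, whereas fallback_new() leaves both T.ACT and T.RET on T2.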
> In case sctp_assoc_control_transport() has *just* put T.ACT_old into
> INACTIVE as it transitioned from ACTIVE->PF->INACTIVE, and it stays in
> INACTIVE only for a very short while before going back to ACTIVE, this
> guarantees that the path is reselected for T.ACT/T.RET since T3 (PF)
> is not available.
> 
> Previously, this was not possible, as we would only select between T.PRI
> and T.RET, and a possible T3 would be NULL because we had just
> transitioned T3 in sctp_assoc_control_transport() from PF->INACTIVE; we
> would thus select a suboptimal path when T.PRI/T.RET have worse properties.
> 
> In the case that T.ACT_old permanently went to INACTIVE during this
> transition and there's no PF path available, plus T.PRI and T.RET are
> INACTIVE as well, we would now camp on T.ACT_old. But if everything is
> INACTIVE, there's really not much we can do except hope for a
> successful HB to bring one of the transports back up again and thus
> cause a new selection through sctp_assoc_control_transport().
> 
> Now both tests work fine:
> 
> Case 1:
> 
>  1. T1 S(ACTIVE) T.ACT
>     T2 S(ACTIVE) T.RET
> 
>  2. T1 S(ACTIVE) T.ACT, T.RET
>     T2 S(PF)
> 
>  3. T1 S(ACTIVE) T.ACT, T.RET
>     T2 S(INACTIVE)
> 
>  5. T1 S(PF) T.ACT, T.RET
>     T2 S(INACTIVE)
> 
> [ 5.1 T1 S(INACTIVE) T.ACT, T.RET
>       T2 S(INACTIVE) ]
> 
>  6. T1 S(ACTIVE) T.ACT, T.RET
>     T2 S(INACTIVE)
> 
>  7. T1 S(ACTIVE) T.ACT
>     T2 S(ACTIVE) T.RET
> 
> Case 2:
> 
>  1. T1 S(ACTIVE) T.ACT
>     T2 S(ACTIVE) T.RET
> 
>  2. T1 S(PF)
>     T2 S(ACTIVE) T.ACT, T.RET
> 
>  3. T1 S(INACTIVE)
>     T2 S(ACTIVE) T.ACT, T.RET
> 
>  5. T1 S(INACTIVE)
>     T2 S(PF) T.ACT, T.RET
> 
> [ 5.1 T1 S(INACTIVE)
>       T2 S(INACTIVE) T.ACT, T.RET ]
> 
>  6. T1 S(INACTIVE)
>     T2 S(ACTIVE) T.ACT, T.RET
> 
>  7. T1 S(ACTIVE) T.ACT
>     T2 S(ACTIVE) T.RET
> 
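The two sequences above can be replayed against a small userspace model
of the whole selection (steps 1-3 plus the patched step 4). Again this
is only a sketch under stated assumptions: struct assoc, the last_heard
counter, elect_best() and select_paths() are hypothetical
simplifications, only the three states used in the tests are modelled,
and the enum value doubles as the ranking.

```c
#include <assert.h>
#include <stddef.h>

enum state { INACTIVE, PF, ACTIVE };	/* value doubles as the score */

struct transport {
	enum state state;
	long last_heard;	/* larger means heard from more recently */
};

struct assoc {
	struct transport *paths;	/* all transports of the association */
	size_t npaths;
	struct transport *primary;	/* T.PRI */
	struct transport *active;	/* T.ACT */
	struct transport *retran;	/* T.RET */
};

/* Higher-ranked transport wins; `best` wins ties, NULL `best` yields `curr`. */
static struct transport *elect_best(struct transport *curr,
				    struct transport *best)
{
	assert(curr != NULL);
	if (best == NULL)
		return curr;
	return curr->state > best->state ? curr : best;
}

static void select_paths(struct assoc *a)
{
	struct transport *pri = NULL, *sec = NULL, *pf = NULL;
	size_t i;

	for (i = 0; i < a->npaths; i++) {
		struct transport *t = &a->paths[i];

		if (t->state == INACTIVE)
			continue;
		if (t->state == PF) {
			/* Remember a PF candidate (T3) for step 4. */
			pf = elect_best(t, pf);
			continue;
		}
		/* Step 1: two most recently heard ACTIVE transports. */
		if (pri == NULL || t->last_heard > pri->last_heard) {
			sec = pri;
			pri = t;
		} else if (sec == NULL || t->last_heard > sec->last_heard) {
			sec = t;
		}
	}

	/* Step 2: an ACTIVE primary path always becomes T.ACT. */
	if (a->primary->state == ACTIVE && a->primary != pri) {
		sec = pri;
		pri = a->primary;
	}

	/* Step 3: only one ACTIVE transport, use it for both roles. */
	if (sec == NULL)
		sec = pri;

	/* Step 4 (patched): nothing ACTIVE, camp on T.ACT_old or a PF path.
	 * This model requires a->active to be set once we get here.
	 */
	if (pri == NULL) {
		pri = elect_best(a->active, pf);
		sec = pri;
	}

	a->active = pri;
	a->retran = sec;
}
```

Calling select_paths() after every state change, with T1 as T.PRI, gives
the T.ACT/T.RET assignments listed in Case 2, including step 5.1 where
both transports are INACTIVE and the model keeps camping on T2.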
> Signed-off-by: Daniel Borkmann <dborkman@xxxxxxxxxx>
> ---
>  net/sctp/associola.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index 104fae4..a88b852 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -1356,14 +1356,11 @@ static void sctp_select_active_and_retran_path(struct sctp_association *asoc)
>  		trans_sec = trans_pri;
>  
>  	/* If we failed to find a usable transport, just camp on the
> -	 * primary or retran, even if they are inactive, if possible
> -	 * pick a PF iff it's the better choice.
> +	 * active or pick a PF iff it's the better choice.
>  	 */
>  	if (trans_pri == NULL) {
> -		trans_pri = sctp_trans_elect_best(asoc->peer.primary_path,
> -						  asoc->peer.retran_path);
> -		trans_pri = sctp_trans_elect_best(trans_pri, trans_pf);
> -		trans_sec = asoc->peer.primary_path;
> +		trans_pri = sctp_trans_elect_best(asoc->peer.active_path, trans_pf);
> +		trans_sec = trans_pri;
>  	}
>  
>  	/* Set the active and retran transports. */
> -- 
> 1.7.11.7
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
Acked-by: Neil Horman <nhorman@xxxxxxxxxxxxx>