RE: Suspected renege problem in sctp

"Roberts, Lee A." <lee.roberts@xxxxxx> · Thu, 31 Jan 2013 04:30:07 +0000

Vlad,

The test code that I'm running at the moment has changes similar to the following.
I think we want to peek at the tail of the queue---and not dequeue (or unlink) the
data until we're sure we want to renege.

                                                -- Lee Roberts


# diff -c ulpqueue.c~ ulpqueue.c
*** ulpqueue.c~ 2012-10-09 14:31:35.000000000 -0600
--- ulpqueue.c  2013-01-30 21:22:49.000000000 -0700
***************
*** 963,973 ****
  
        tsnmap = &ulpq->asoc->peer.tsn_map;
  
!       while ((skb = __skb_dequeue_tail(list)) != NULL) {
!               freed += skb_headlen(skb);
                event = sctp_skb2event(skb);
                tsn = event->tsn;
  
                sctp_ulpevent_free(event);
                sctp_tsnmap_renege(tsnmap, tsn);
                if (freed >= needed)
--- 963,976 ----
  
        tsnmap = &ulpq->asoc->peer.tsn_map;
  
!       while ((skb = skb_peek_tail(list)) != NULL) {
                event = sctp_skb2event(skb);
                tsn = event->tsn;
+               if (TSN_lte(tsn, sctp_tsnmap_get_ctsn(tsnmap)))
+                       break;
  
+               __skb_unlink(skb, list);
+               freed += skb_headlen(skb);
                sctp_ulpevent_free(event);
                sctp_tsnmap_renege(tsnmap, tsn);
                if (freed >= needed)
#



-----Original Message-----
From: Vlad Yasevich [mailto:vyasevich@xxxxxxxxx] 
Sent: Wednesday, January 30, 2013 7:59 PM
To: Montgomery, Bob
Cc: Roberts, Lee A.; linux-sctp@xxxxxxxxxxxxxxx
Subject: Re: Suspected renege problem in sctp

On 01/30/2013 07:51 PM, Bob Montgomery wrote:
> Vlad,
>
> If you're not the right guy to ask can you let me know who might be?

Hi Bob.  I am still the right person, but it would be even better to
address these types of questions and writeups to linux-sctp@xxxxxxxxxxxxxxx.

>
> We've been investigating sctpspray hangs for quite some time.
>
> I think we're seeing a case where a renege operation is removing
> fragments from the ulp reasm queue that are at or below the value
> of cumulative_tsn_ack_point.  We put a BUG statement in
> sctp_tsnmap_renege so we'd crash instead of ignoring this case:
>
>          if (TSN_lt(tsn, map->base_tsn))
>                  return;


Hmm..  Looking at the reneging functions there is no TSN checking at 
all.  You are completely right.  We MUST not renege DATA that has moved
the cumulative_tsn_ack_point.

Adding something like this in sctp_ulpq_renege_list should fix this:

diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index ada1746..9bd94e6 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -970,10 +970,14 @@ static __u16 sctp_ulpq_renege_list(struct 
sctp_ulpq *ulpq,
         tsnmap = &ulpq->asoc->peer.tsn_map;

         while ((skb = __skb_dequeue_tail(list)) != NULL) {
-               freed += skb_headlen(skb);
                 event = sctp_skb2event(skb);
                 tsn = event->tsn;
-
+
+               /* Make sure we do not renege below CTSN */
+               if (TSN_lte(tsn, sctp_tsnmap_get_ctsn(tsnmap)))
+                       break;
+
+               freed += skb_headlen(skb);
                 sctp_ulpevent_free(event);
                 sctp_tsnmap_renege(tsnmap, tsn);
                 if (freed >= needed)

I think there also might be a bug here when reneging from the ordered 
queue and the message has been reassembled.  I need to look at that a 
bit more.

-vlad

>
> And after two days of sctpspray pounding, hit the bug.
>
> Here's a partial write-up using examples from the core file:
>
> =========================================================================
> Fact 1:  a renege operation will only be launched if there's a gap in
> the tsn.
>
> So a reassembly queue like this one would not be set upon:
>
> PID 1784
> sctp_association 0xffff88041b6a2000
>
>        tsn_map = 0xffff88041dd8d560,
>        base_tsn = 0x55751715,
>        cumulative_tsn_ack_point = 0x55751714,
>        max_tsn_seen = 0x55751714,
>
> reasm queue summary:
>    ssn = 0x345,   tsn = 0x5575170c,   msg_flags = 0x2,   rmem_len = 0x69c
>    ssn = 0x345,   tsn = 0x5575170d,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x345,   tsn = 0x5575170e,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x345,   tsn = 0x5575170f,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x345,   tsn = 0x55751710,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x345,   tsn = 0x55751711,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x345,   tsn = 0x55751712,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x345,   tsn = 0x55751713,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x345,   tsn = 0x55751714,   msg_flags = 0x0,   rmem_len = 0x69c
>
> No gap: the last tsn in the reasm queue matches cumulative_tsn_ack_point = 0x55751714,
>
>
> In our case, I believe the reasm queue looked like this when the renege launched:
>
> In sctp_association 0xffff88041b845000
>
>        base_tsn = 0x936a6d76,
>        cumulative_tsn_ack_point = 0x936a6d75,
>        max_tsn_seen = 0x936a6d79,
>
>    ssn = 0x0,   tsn = 0x936a6d6f,   msg_flags = 0x2,   rmem_len = 0x69c
>    ssn = 0x0,   tsn = 0x936a6d70,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x0,   tsn = 0x936a6d71,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x0,   tsn = 0x936a6d72,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x0,   tsn = 0x936a6d73,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x0,   tsn = 0x936a6d74,   msg_flags = 0x0,   rmem_len = 0x69c
>    ssn = 0x0,   tsn = 0x936a6d75,   msg_flags = 0x0,   rmem_len = 0x69c
>
>    ssn = 0x0,   tsn = 0x936a6d78,   msg_flags = 0x0,   rmem_len = 0x628
>
> The gap between 75 and 78 meant that a renege could be launched.
>
> It was launched for tsn = 0x936a6d76 (just arriving and apparently out
> of memory), and the "needed" amount was 0x05ac (1452 bytes).
>
> tsn = 0x936a6d78 was removed, and the amount recovered was 0x538 (1336 bytes).
> The value of "rmem_len" in the event is not what is used to calculated needed
> and freed.
>
> Since 0x538 didn't satisfy 0x5ac, it went for the next one down on the queue
> (tsn = 0x936a6d75)
> and recovered 0x5ac from it for a total recovery of 0xae4 (2788 bytes).
> So because the first post-gap fragment happened to be a LAST_FRAG and shorter than
> the rest of them, it wasn't enough to satisfy the request and we moved on
> to the one that caused the BUG.
>
> If there had been two gapped frags, or if the gapped frag had been another
> middle one that was big enough to satisfy the request, it would not have
> been caught freeing a fragment that was at the cumulative tsn ack point.
> ========================================================================
>
> Since the base_tsn and cumulative_tsn_ack_point are advanced in
> sctp_ulpevent_make_rcvmsg() before putting the fragments on the
> reasm queue, the renege code should not be allowed to dip below
> that point in sctp_ulpq_renege_list().   Otherwise, you're
> discarding undelivered data that you've already reported as
> "delivered" to the sender, right?
>
> Thanks,
> Bob Montgomery
>
>
>
>

��.n��������+%������w��{.n�����{������ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f