Re: SCTP Hang in reorder queue

Vlad Yasevich <vyasevich@xxxxxxxxx> · Tue, 12 Feb 2013 20:06:57 -0500

On 02/12/2013 06:22 PM, Bob Montgomery wrote:
Below, I've copied the discussion of the reorder lobby hang caused by
the arrival of a duplicate event tsn.  It was originally presented in
the thread about renege bugs, but this hang is not a renege problem.

Here are the conditions that cause a tsn mark to be removed from the
tsn map while we are still holding that event in the lobby:

With a base_tsn of 1000, here is the state of the map after the arrival
of these tsns:
1001
1002
1003
1004
1063
1064
1129

bobm@ned4g8:~/Examples$ testtsn <tsnfile2
mark 1001
tsnmap len: 64, base_tsn: 1000, max_tsn_seen: 1001, cum_ack: 999
  0000000000000002

mark 1002
tsnmap len: 64, base_tsn: 1000, max_tsn_seen: 1002, cum_ack: 999
  0000000000000006

mark 1003
tsnmap len: 64, base_tsn: 1000, max_tsn_seen: 1003, cum_ack: 999
  000000000000000e

mark 1004
tsnmap len: 64, base_tsn: 1000, max_tsn_seen: 1004, cum_ack: 999
  000000000000001e

mark 1063
tsnmap len: 64, base_tsn: 1000, max_tsn_seen: 1063, cum_ack: 999
  800000000000001e

^^^^^ Bit 63 set here

mark 1064
tsnmap len: 128, base_tsn: 1000, max_tsn_seen: 1064, cum_ack: 999
  800000000000001e 0000000000000001

^^^^^ Bit 64 causes expansion

mark 1129
tsnmap len: 256, base_tsn: 1000, max_tsn_seen: 1129, cum_ack: 999
  800000000000001e 0000000000000000 0000000000000002 0000000000000000

^^^^^ Bit 64 disappears when bit 129 is added

Note that after the arrival of 1064 (bit 64 in the map), the map is
expanded to 128 and bit 64 is filled in.

When 1129 arrives, the map is expanded in sctp_tsnmap_grow and the old
part of the map is copied to the new map here:

         new = kzalloc(len>>3, GFP_ATOMIC);
         if (!new)
                 return 0;

         bitmap_copy(new, map->tsn_map, map->max_tsn_seen - map->base_tsn);

But the copy is for 64 bits (max_tsn_seen - base_tsn) and it should have been
for 65 bits to pick up bit 64.  This only hurts us when the max_tsn_seen bit is
the LSB of a word.  In all other cases, bitmap_copy rounds up to a whole 64-bit
word and picks up the max_tsn_seen bit even though we didn't ask for it.

When the max_tsn_seen bit occupies the LSB of a word in the map, and a subsequent
arrival causes the map to grow, the previous max_tsn_seen bit is lost.  With its
mark gone from the map, a later copy of that tsn is accepted as new, and the
duplicate event leads to the hang in the reorder queue.

Should be:
         bitmap_copy(new, map->tsn_map, map->max_tsn_seen - map->base_tsn + 1);

Actually, should be max_tsn_seen - cumulative_tsn_ack_point, but that's 
really the same thing as you have above.
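
As a rough user-space illustration of the word rounding discussed above (not
the kernel code itself), the sketch below models bitmap_copy as a copy of
whole 64-bit words.  The names model_bitmap_copy and survives_grow are made up
for this example, and a 64-bit word size is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORD_BITS 64u
#define WORDS_FOR(nbits) (((nbits) + WORD_BITS - 1) / WORD_BITS)

/* Hypothetical stand-in for bitmap_copy: copies whole 64-bit words only. */
static void model_bitmap_copy(uint64_t *dst, const uint64_t *src, unsigned nbits)
{
	memcpy(dst, src, WORDS_FOR(nbits) * sizeof(uint64_t));
}

static void set_bit64(uint64_t *map, unsigned bit)
{
	map[bit / WORD_BITS] |= (uint64_t)1 << (bit % WORD_BITS);
}

static int test_bit64(const uint64_t *map, unsigned bit)
{
	return (int)((map[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1);
}

/*
 * Set only the max_tsn_seen bit in a 128-bit map, copy it into a larger
 * map using (max_tsn_seen - base_tsn + extra) bits, and report whether
 * the bit is still set.  extra = 0 models the current copy length,
 * extra = 1 models the proposed one.
 */
static int survives_grow(uint32_t base_tsn, uint32_t max_tsn_seen, unsigned extra)
{
	uint64_t old_map[2] = { 0 };
	uint64_t new_map[4] = { 0 };
	unsigned top = max_tsn_seen - base_tsn;

	assert(top < 128 && top + extra <= 128); /* stay inside old_map */

	set_bit64(old_map, top);
	model_bitmap_copy(new_map, old_map, top + extra);
	return test_bit64(new_map, top);
}
```

With base_tsn 1000, survives_grow(1000, 1064, 0) returns 0 (bit 64 is dropped,
as in the trace), while survives_grow(1000, 1064, 1) and
survives_grow(1000, 1063, 0) both return 1.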

Thank you for the explanation, Bob.  This has indeed been a day-1 bug since
the tsn re-write that happened a while ago, and it is tough to hit.

The problem is that base_tsn is bit 0 while max_tsn_seen is the last
non-zero bit.  Thus the difference between the two is always one less than
the number of bits to copy, which results in the off-by-one copy.

Thanks
-vlad

Which gives this:

mark 1063
tsnmap len: 64, base_tsn: 1000, max_tsn_seen: 1063, cum_ack: 999
  800000000000001e

mark 1064
tsnmap len: 128, base_tsn: 1000, max_tsn_seen: 1064, cum_ack: 999
  800000000000001e 0000000000000001

mark 1129
tsnmap len: 256, base_tsn: 1000, max_tsn_seen: 1129, cum_ack: 999
  800000000000001e 0000000000000001 0000000000000002 0000000000000000

^^^^^ Bit 64 does not disappear

Bob Montgomery

Background material (how a duplicate tsn causes the reorder queue to
hang):
===============================================================
The second hang does not involve the reasm queue.  It occurs on a test
where all the events are non-fragmented.  The final state of the
ulpq lobby is this:

SSN   X
SSN   X+1
SSN   X+2
SSN   X+3
...
SSN   X+last

And the Next expected SSN value in ssnmap is X+1.
So we're waiting on X+1, but X is the first item in the queue.

I think that is caused under these conditions:

Say the lobby queue had:

ssn  10
ssn  10  (duplicate)
ssn  11
ssn  12

and we're waiting on ssn 9...

call sctp_ulpq_order with event ssn 9:
ssn_next is incremented to 10
call sctp_ulpq_retrieve_ordered()
start down the list and find 10.
ssn_next is incremented to 11.
grab ssn 10 off the queue, add to event_list and go around.
find 10 again and it's != new ssn_next(11), so break.

Now we're hung forever.
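
The walk described above can be sketched in user space roughly as follows.
This is only a model under simplifying assumptions: a single stream, the
lobby as a sorted array, and made-up names (lobby_model, model_order).  It is
not the code from sctp_ulpq_order or sctp_ulpq_retrieve_ordered.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-stream model of the lobby, kept sorted by ssn. */
struct lobby_model {
	uint16_t ssn[16];
	size_t len;
	uint16_t ssn_next;	/* next ssn we expect to deliver */
};

/*
 * Model of an in-order arrival: deliver it, then drain queued events
 * from the head of the lobby while they match ssn_next.  Returns the
 * number of events delivered.  An out-of-order arrival would be queued
 * in the real code; this model simply ignores it and returns 0.
 */
static size_t model_order(struct lobby_model *l, uint16_t arriving_ssn)
{
	size_t delivered = 0;
	size_t taken = 0;
	size_t i;

	if (arriving_ssn != l->ssn_next)
		return 0;
	l->ssn_next++;
	delivered++;

	/* Stop at the first queued event that is not the expected ssn. */
	while (taken < l->len && l->ssn[taken] == l->ssn_next) {
		l->ssn_next++;
		taken++;
		delivered++;
	}

	/* Remove the delivered events from the head of the lobby. */
	for (i = taken; i < l->len; i++)
		l->ssn[i - taken] = l->ssn[i];
	l->len -= taken;

	return delivered;
}
```

Starting from a lobby of {10, 10, 11, 12} with ssn_next = 9, a call to
model_order with ssn 9 delivers two events (9 and the first 10) and leaves
ssn_next = 11 with the duplicate 10 at the head of the lobby.  The event with
ssn 11 is already queued behind that duplicate, so nothing drains it.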

We built a module with a BUG statement on putting a duplicate
into the lobby and hit it.

The duplicate event was at the end of a group of sequential events,
followed by a gap and then another group of sequential events.
Coincidentally (or not), at the time the duplicate
was sent to the lobby, it was represented by a lone bit in a
word of the tsnmap:

...
   ssn = 0x30d,
   tsn = 0x5505020f,

   ssn = 0x30e,     <<<<<<< About to insert this one again
   tsn = 0x55050210,

Big actual gap

   ssn = 0x378,
   tsn = 0x5505027a,

   ssn = 0x379,
   tsn = 0x5505027b,
...

tsn_map = 0xffff8807aa430b80,
base_tsn = 0x550501d0,

crash-6.0.8bobm> p/x 0x210-0x1d1
$8 = 0x3f
So 63 (0x3f) + 1 = 64 bits set,
then 106 (0x6a) - 1 = 105 bits clear,
then 12 (0xc) + 1 = 13 bits set.

crash-6.0.5bobm> rd 0xffff8807aa430b80 8
ffff8807aa430b80:  fffffffffffffffe 0000000000000001   ................
ffff8807aa430b90:  007ffc0000000000 0000000000000000   ................

fffffffffffffffe   1 off, 63 on
0000000000000001   1 on,  63 off
007ffc0000000000   42 off,  13 on

The lone bit in the second word describes tsn 0x55050210, our duplicate.
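
The bit counts and the position of the duplicate can be double-checked with a
few lines of C.  This is just a sanity-check sketch; the helper names
(count_bits, tsn_word, tsn_bit) are invented here and 64-bit map words are
assumed.

```c
#include <stdint.h>

/* Count the set bits in one 64-bit map word. */
static unsigned count_bits(uint64_t w)
{
	unsigned n = 0;

	while (w) {
		w &= w - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Index of the 64-bit word that holds the bit for tsn. */
static unsigned tsn_word(uint32_t base_tsn, uint32_t tsn)
{
	return (tsn - base_tsn) / 64;
}

/* Bit position of tsn inside its word (0 is the LSB). */
static unsigned tsn_bit(uint32_t base_tsn, uint32_t tsn)
{
	return (tsn - base_tsn) % 64;
}
```

For the dump above, count_bits gives 63, 1 and 13 for the first three words,
and tsn 0x55050210 with base_tsn 0x550501d0 maps to word 1, bit 0.  Likewise
tsn 0x5505027a maps to word 2, bit 42, matching the "42 off" line.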
