Re: SCTP abort with T-bit set after handshake

Marcelo,
Sorry for the slow reply; I have been away and then struggled to reproduce the problem.


> 
> A few lines below it will check if an asoc couldn't be found and will
> increment SCTP_MIB_OUTOFBLUES. There are more places that inc it, but
> it's a start.
> 
> It should show up in netstat -s or /proc/net/sctp/snmp.
> 

I have finally caught another instance of the problem while monitoring the SCTP statistics.
This was made harder by the fact that the out-of-blue counter goes up by about 600 in total over a complete set of tests (I assume mainly at the end of each test, when connections are abruptly terminated).
I have therefore been capturing the stats every 100 ms and looking at the counters at the moment the problem occurred.
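
A capture loop of this kind can be sketched as below. This is only an illustration, not the script used for the captures in this mail: the function names, the default sample count and the timestamp format are assumptions, and it relies on /proc/net/sctp/snmp containing one "Name Value" pair per line, as in the dumps further down.

```python
import time
from datetime import datetime

# Path mentioned earlier in the thread; only present when the sctp module is loaded.
SNMP_PATH = "/proc/net/sctp/snmp"


def parse_sctp_snmp(text):
    """Turn 'Name<whitespace>Value' lines into a {name: int} dict.

    Lines that do not have exactly two fields (blank lines, timestamps)
    are skipped.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def capture(path=SNMP_PATH, interval=0.1, samples=10):
    """Return a list of (timestamp, counters) taken every `interval` seconds."""
    snapshots = []
    for _ in range(samples):
        with open(path) as f:
            counters = parse_sctp_snmp(f.read())
        # Millisecond-resolution wall-clock stamp, e.g. 16:07:12.708
        stamp = datetime.now().strftime("%H:%M:%S.%f")[:-3]
        snapshots.append((stamp, counters))
        time.sleep(interval)
    return snapshots
```

Calling capture() with the defaults collects ten snapshots roughly 100 ms apart; the real interval is slightly longer because reading the file takes time too.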

The two captures below, taken about 100 ms apart around the failure, show the out-of-blue counter being incremented (3471 to 3472) at the same time as the SCTP connection failure.

16:07:12.708
SctpCurrEstab                           22
SctpActiveEstabs                        64619
SctpPassiveEstabs                       64618
SctpAborteds                            108462
SctpShutdowns                           24922
SctpOutOfBlues                          3471
SctpChecksumErrors                      0
SctpOutCtrlChunks                       915680
SctpOutOrderChunks                      708312
SctpOutUnorderChunks                    0
SctpInCtrlChunks                        1314834
SctpInOrderChunks                       704751
SctpInUnorderChunks                     0
SctpFragUsrMsgs                         0
SctpReasmUsrMsgs                        0
SctpOutSCTPPacks                        1489904
SctpInSCTPPacks                         1488886
SctpT1InitExpireds                      108
SctpT1CookieExpireds                    2
SctpT2ShutdownExpireds                  80
SctpT3RtxExpireds                       162
SctpT4RtoExpireds                       0
SctpT5ShutdownGuardExpireds             0
SctpDelaySackExpireds                   54915
SctpAutocloseExpireds                   0
SctpT3Retransmits                       157
SctpPmtudRetransmits                    0
SctpFastRetransmits                     0
SctpInPktSoftirq                        1217809
SctpInPktBacklog                        270941
SctpInPktDiscards                       3483
SctpInDataChunkDiscards                 0

16:07:12.810
SctpCurrEstab                           38
SctpActiveEstabs                        64627
SctpPassiveEstabs                       64627
SctpAborteds                            108463
SctpShutdowns                           24922
SctpOutOfBlues                          3472
SctpChecksumErrors                      0
SctpOutCtrlChunks                       915742
SctpOutOrderChunks                      708342
SctpOutUnorderChunks                    0
SctpInCtrlChunks                        1314899
SctpInOrderChunks                       704781
SctpInUnorderChunks                     0
SctpFragUsrMsgs                         0
SctpReasmUsrMsgs                        0
SctpOutSCTPPacks                        1489978
SctpInSCTPPacks                         1488960
SctpT1InitExpireds                      108
SctpT1CookieExpireds                    2
SctpT2ShutdownExpireds                  80
SctpT3RtxExpireds                       162
SctpT4RtoExpireds                       0
SctpT5ShutdownGuardExpireds             0
SctpDelaySackExpireds                   54920
SctpAutocloseExpireds                   0
SctpT3Retransmits                       157
SctpPmtudRetransmits                    0
SctpFastRetransmits                     0
SctpInPktSoftirq                        1217854
SctpInPktBacklog                        270970
SctpInPktDiscards                       3484
SctpInDataChunkDiscards                 0
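
Comparing the two captures by eye is error-prone, so here is a minimal sketch of a helper that prints only the counters that changed. The function name is made up for illustration, and the dictionaries hold just a subset of the values from the two dumps above. Between 16:07:12.708 and 16:07:12.810, SctpAborteds, SctpOutOfBlues and SctpInPktDiscards each went up by exactly 1.

```python
def counter_deltas(before, after):
    """Return {name: after - before} for counters whose value changed."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }


# Subset of the two captures above (16:07:12.708 and 16:07:12.810).
before = {
    "SctpAborteds": 108462,
    "SctpOutOfBlues": 3471,
    "SctpChecksumErrors": 0,
    "SctpT3Retransmits": 157,
    "SctpInPktDiscards": 3483,
}
after = {
    "SctpAborteds": 108463,
    "SctpOutOfBlues": 3472,
    "SctpChecksumErrors": 0,
    "SctpT3Retransmits": 157,
    "SctpInPktDiscards": 3484,
}

for name, change in counter_deltas(before, after).items():
    print(f"{name:<32}{change:+d}")
# Prints:
# SctpAborteds                    +1
# SctpOutOfBlues                  +1
# SctpInPktDiscards               +1
```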

> 
> Btw, is this test public? Can I run it too?  

Unfortunately, it is private.


> Or if you can create a
> small reproducer, that would be great.

That would be great, if I could work out which elements of what I am doing are important.
The tests open, close and abort large numbers of connections.
Some of the connections exchange a lot of data; others carry hardly anything.
Which connection fails appears to be fairly random, as does the timing of the failure.
On average, the failure only occurs after more than an hour of running.
Do you have any hints as to the kind of behaviour that could trigger a failure like this?

Thanks,
Dave.







