Re: 200ms delays with SCTP streaming data

Corey Minyard <minyard@xxxxxxx> · Tue, 14 Jul 2020 08:10:18 -0500

On Tue, Jul 14, 2020 at 07:12:58AM -0500, Corey Minyard wrote:
> On Mon, Jul 13, 2020 at 07:11:04PM -0300, Marcelo Leitner wrote:
> > On Mon, Jul 13, 2020 at 04:59:07PM -0500, Corey Minyard wrote:
> > > Hi, it's me again with another strange issue.  In case you didn't figure
> > > it out before, I'm working on a library that supports all different
> > > types of stream I/O, and SCTP is one supported building block.  I
> > > noticed when I stacked a multiplexer layer on top of SCTP I started
> > > getting timeouts occasionally.  It took a bit, but I finally realized
> > > that I was getting 200ms delays occasionally between sending a packet
> > > and receiving a packet.  I verified this with a trace right at the
> > > sctp_send() and sctp_recvmsg() calls.  It doesn't seem to be regular
> > > in any way I can see, but it happens often enough to cause issues.
> > > 
> > > If I replace the SCTP block with a TCP block, it works fine, and pretty
> > > much all the code is the same except where it does the read and write
> > > calls (including the epoll() usage, and I have also switched to select()
> > > and it has the same issue).  The write calls don't seem to be the issue,
> > > I see two back-to-back writes a few microseconds apart and see a 200ms
> > > delay between the messages on the receive side.
> > > 
> > > The test in question sets up two connections and does a big simultaneous
> > > bidirectional transfer.  The test app has 4 threads waiting on epoll()
> > > handling data and writing data.
> > > 
> > > And the delay is always ~200ms.  Which sounds suspicious.
> > 
> > That could be the delayed SACK timer in the kernel:
> > /* Delayed sack timer - 200ms */
> > #define SCTP_DEFAULT_TIMEOUT_SACK       (200)
> > 
> > You may tweak the sysctl net.sctp.sack_timeout and see if it changes
> > accordingly, or set it via SCTP_PEER_ADDR_PARAMS, or even enable
> > immediate ack (by setting SPP_SACKDELAY_DISABLE).
> 
> Ok, setting SPP_SACKDELAY_DISABLE does make the problem go away.
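
For anyone finding this thread later, the SPP_SACKDELAY_DISABLE setting
discussed above might be applied roughly as follows.  This is only a
minimal sketch, not the code from the library: the helper names
fill_sack_delay_disable() and disable_delayed_sack() are made up for
illustration, it assumes the kernel UAPI header <linux/sctp.h> is
available, and it assumes a zeroed spp_address is accepted as "all peer
addresses of the association".

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_SCTP */
#include <linux/sctp.h>   /* struct sctp_paddrparams, SCTP_PEER_ADDR_PARAMS */

/*
 * Hypothetical helper: fill in peer address parameters that request
 * immediate SACKs.  All other fields are left at zero; the address is
 * zeroed on the assumption that this selects every peer address.
 */
void fill_sack_delay_disable(struct sctp_paddrparams *p, sctp_assoc_t assoc_id)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));
    p->spp_assoc_id = assoc_id;
    p->spp_flags = SPP_SACKDELAY_DISABLE;
}

/*
 * Hypothetical helper: apply the setting to an SCTP socket.
 * Returns 0 on success, or -1 with errno set by setsockopt().
 */
int disable_delayed_sack(int fd, sctp_assoc_t assoc_id)
{
    struct sctp_paddrparams p;

    fill_sack_delay_disable(&p, assoc_id);
    return setsockopt(fd, IPPROTO_SCTP, SCTP_PEER_ADDR_PARAMS,
                      &p, sizeof(p));
}
```

A caller would invoke disable_delayed_sack(fd, 0) on a one-to-one style
socket once it exists and check the return value.  Note that this only
changes when the local side sends its own SACKs, so both ends of the
connection would need it for a bidirectional test.
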
> 
> > 
> > > 
> > > It's not using sctp_sendv() at the moment, as the systems I'm running on
> > > don't have that yet.  But the library does have support if it sees it is
> > > available.
> > > 
> > > So I don't think it's my library; I've stared at it a bunch (and found a
> > > few other bugs) but I can't reconcile this one.  There are no timers
> > 
> > Nice.
> > 
> > > that would cause this in the code in question.  Just basically an
> > > epoll() call waiting on data and receive processing that is comparing
> > > data, along with write processing that is sending the same data.
> > > 
> > > Anyway, I haven't tried to create a small reproducer; I thought I would
> > > report it first and see if anything rang a bell.  I tried this on a
> > > recent kernel and got the same issue.
> > 
> > I guess a combination of xmit rate, msg and buffer sizes and packet
> > drops can lead to this delay. I've seen it happening, but didn't have
> > the time to track it down back then.
> 
> There should be no packet drops.  It's running across localhost, and
> flow control in the multiplex layer as it's set up for the tests limits
> the maximum outstanding data to 1024 bytes.  With overhead and flow
> control messages it's maybe 1050 bytes of data that would ever be
> unacked.  (It's not really testing throughput, it's testing the inner
> workings of the multiplexing protocol.)
> 
> If I understand this correctly per the RFCs, by default if there are two
> messages outstanding, it will send an sack immediately.  Otherwise it
> waits 200ms.  I'm guessing what is happening is that SCTP sends a
> sack and then receives one more message and the upper layer stops
> because of flow control.  Then the sack comes out in 200ms and things
> continue.

Actually, that still doesn't make sense.  The lack of a sack shouldn't
keep anything from sending unless the congestion window is closed, which
shouldn't happen in this case.  Am I correct?
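
To make the delayed-SACK rule from the quoted paragraph concrete, here
is a toy model of the receiver side.  It is a sketch under stated
assumptions and not the kernel's implementation: struct sack_model and
both functions are hypothetical names, time is a plain millisecond
counter supplied by the caller, and the only rules modelled are "SACK
immediately on the second unacknowledged packet, otherwise within
200ms".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SACK_DELAY_MS 200   /* same value as SCTP_DEFAULT_TIMEOUT_SACK */

/* Hypothetical receiver state, for illustration only. */
struct sack_model {
    int  unacked_packets;   /* DATA packets received since the last SACK */
    bool timer_armed;       /* is the delayed-SACK timer running? */
    long timer_expiry_ms;   /* when the pending SACK would be sent */
};

/*
 * Process one DATA packet arriving at now_ms.
 * Returns true if a SACK goes out immediately, false if it is left to
 * the delayed-SACK timer.
 */
bool sack_model_on_data(struct sack_model *m, long now_ms)
{
    assert(m != NULL);
    m->unacked_packets++;
    if (m->unacked_packets >= 2) {
        /* Second unacknowledged packet: SACK now and cancel the timer. */
        m->unacked_packets = 0;
        m->timer_armed = false;
        return true;
    }
    if (!m->timer_armed) {
        /* First unacknowledged packet: start the 200ms timer. */
        m->timer_armed = true;
        m->timer_expiry_ms = now_ms + SACK_DELAY_MS;
    }
    return false;
}

/* Time at which the pending SACK fires, or -1 if none is pending. */
long sack_model_pending_sack_time(const struct sack_model *m)
{
    assert(m != NULL);
    return m->timer_armed ? m->timer_expiry_ms : -1;
}
```

Feeding it three packets at 0, 1 and 2 ms gives a deferred SACK, an
immediate SACK, and then a third packet whose SACK is only due at
202 ms, which is the "one more message, then a 200ms wait" pattern
described in the quoted text.  The model says nothing about whether the
sender is actually blocked during that time.
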

-corey

> 
> So I think I can figure out how to make this work smoothly.  I assume
> this doesn't happen in TCP because all packets carry an ack, and there
> is data flowing both ways all the time.
> 
> Thanks,
> 
> -corey
> 
> > 
> > That said, remember that Linux SCTP doesn't support buffer
> > auto-tuning. So considering you're running a stress test, you probably
> > want to adjust them accordingly manually to avoid packet drops.
> > 
> >   Marcelo
> > 
> > > 
> > > The library is at https://github.com/cminyard/gensio.  I'd need to
> > > provide a patch for the tracing.
> > > 
> > > -corey