Re: [EXTERNAL] Re: What throughput is reasonable?

David Woodhouse <dwmw2@xxxxxxxxxxxxx> · Mon, 25 Mar 2019 09:54:33 +0000

On Mon, 2019-03-25 at 11:41 +0200, Daniel Lenski wrote:
> On Mon, Mar 25, 2019 at 10:29 AM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> > 
> > On Sun, 2019-03-24 at 19:13 +0200, Daniel Lenski wrote:
> > > 
> > > Do I have this right? High packet loss from client→VPN, low packet
> > > loss from VPN→client?
> > > 
> > > If so, I'm guessing your problems are MTU-related.
> > 
> > Hm, wouldn't we expect that to be more consistent? If the full-sized
> > packets are getting lost, that would just stall and not lose the
> > *occasional* packet?
> 
> Yeah… should be. My guess is based on a couple of previous
> less-detailed reports from users of earlier versions with GP.
> 
> > If it really is a repeatable drop every N packets, I might be inclined
> > to look at sequence numbers and epoch handling. Are we doing any ESP
> > rekeying?
> 
> We are rekeying, but only using the most naïve "tunnel rekey" method.
> AFAIK, that's all that GP supports.
> https://gitlab.com/openconnect/openconnect/blob/v8.02/gpst.c#L1153-1157
> 
> After a certain time has elapsed, we tear down the TLS connection and
> reconnect (using the same auth cookie), which also invalidates the
> previous ESP keys and requires us to start using new ones. We should
> handle late incoming packets using the "old" ESP keys correctly, using
> the same method as with Juniper.

We might be handling late incoming packets correctly, but we stop actually
sending them. I wonder if we should continue to send ESP packets on the
"old" connection even while we're doing the reconnect?

But a reconnect/rekey would be clearly visible in OpenConnect output.
Tony, presumably you'd have seen that and mentioned it?

Also, you said that you hit this a repeatable 4142 packets into a TCP
connection? That was regardless of how long the VPN had been up?

One possible explanation is that OpenConnect is actually *faster* than
the other client, and it's hitting buffer overflows elsewhere in the
network. You could try *throttling* it to the maximum bandwidth it
actually does manage to achieve, and see if that helps?
(We have no throttling control; you'd have to do something naïve in the
code that reads from the tun device, counting bytes/second and
declining to read any more when it's had enough.)
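
To make the "count bytes/second and decline to read" idea concrete, here
is a minimal sketch of that kind of naïve throttle. It is written in
Python for brevity rather than as a patch to OpenConnect's C code, and
the names (ByteRateThrottle, may_read, account) are hypothetical, not
anything that exists in OpenConnect. It assumes a simple fixed one-second
window, which is only one of several ways to do this.

```python
import time


class ByteRateThrottle:
    """Naive fixed-window throttle (hypothetical helper, not OpenConnect code).

    Counts the bytes read in the current one-second window and reports
    whether the caller may read another packet from the tun device.
    """

    def __init__(self, max_bytes_per_sec, clock=time.monotonic):
        self.max_bytes_per_sec = max_bytes_per_sec
        self.clock = clock  # injectable so the logic can be tested
        self.window_start = clock()
        self.bytes_in_window = 0

    def may_read(self):
        """Return True if the byte budget for this window is not yet spent."""
        now = self.clock()
        if now - self.window_start >= 1.0:
            # A new one-second window has begun: reset the counter.
            self.window_start = now
            self.bytes_in_window = 0
        return self.bytes_in_window < self.max_bytes_per_sec

    def account(self, nbytes):
        """Record that a packet of nbytes was read from the tun device."""
        self.bytes_in_window += nbytes
```

In the main loop, the tun read would be guarded by may_read() and
followed by account(len(packet)); while may_read() returns False the
loop simply would not read from the tun fd. Packets then queue up (and
eventually get dropped) locally, which is the point: the sending TCP
stack backs off before the traffic reaches whatever is overflowing
further along the path.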
