On Fri, 2019-04-05 at 14:54 +0100, David Woodhouse wrote:
>
> $ sudo perf record -a ./netperf -H 172.16.0.1 -t UDP_STREAM -- -m 1400
> MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.16.0.1 () port 0 AF_INET
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 212992    1400   10.00     1198093      0     1341.86
> 212992           10.00     1198044            1341.80
>
> At this point netperf is taking 100% of CPU, doing this:
>
> Samples: 49K of event 'cycles', Event count (approx.): 32648308583
>   Overhead  Command  Shared Object     Symbol
>     31.59%  netperf  [kernel.vmlinux]  [k] sha_transform
>     17.49%  netperf  [kernel.vmlinux]  [k] _aesni_enc1
>      2.88%  netperf  [kernel.vmlinux]  [k] _raw_spin_lock

And now with OpenConnect on the sending end:

$ ./netperf -H 172.16.0.2 -t UDP_STREAM -- -m 1400
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.16.0.2 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1400   10.00     8132570      0     9108.46
212992           10.00     1478093            1655.46

Both openconnect and netperf are eating 100% of CPU:

Overhead  Command         Shared Object         Symbol
  12.05%  lt-openconnect  libgnutls.so.30.23.2  [.] sha1_block_data_order_ssse3
   7.34%  lt-openconnect  libgnutls.so.30.23.2  [.] aesni_cbc_encrypt
   3.34%  swapper         [kernel.vmlinux]      [k] intel_idle
   3.21%  netperf         [kernel.vmlinux]      [k] entry_SYSCALL_64
   2.93%  netperf         [kernel.vmlinux]      [k] csum_partial_copy_generic

In the opposite direction (receiving into OpenConnect) I get:

$ ./netperf -H 172.16.0.1 -t UDP_STREAM -- -m 1400
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.16.0.1 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1400   10.00     1258914      0     1409.97
212992           10.00     1255689            1406.36

This time OpenConnect is using a whole CPU, but netserver is only at about 20%:

Overhead  Command         Shared Object         Symbol
  14.58%  lt-openconnect  libgnutls.so.30.23.2  [.] sha1_block_data_order_ssse3
  10.88%  swapper         [kernel.vmlinux]      [k] intel_idle
   2.98%  lt-openconnect  [kernel.vmlinux]      [k] entry_SYSCALL_64
   2.78%  lt-openconnect  [kernel.vmlinux]      [k] copy_user_enhanced_fast_string
   2.61%  lt-openconnect  libnettle.so.6.5      [.] _nettle_sha1_compress
   2.32%  netserver       [kernel.vmlinux]      [k] csum_partial_copy_generic
   2.20%  lt-openconnect  [kernel.vmlinux]      [k] syscall_return_via_sysret
   2.03%  lt-openconnect  libgnutls.so.30.23.2  [.] aesni_cbc_encrypt

In either direction, on these Skylake boxes the userspace path seems to perform at least as well as the kernel implementation, and has no trouble sustaining 1Gb/s.

Tony, can you do the equivalent experiments on your own setup? I'm attaching my espsetup.sh again now that I have it working, along with the fake responses (gpconf.xml). You'll need to frob the 'SRC' and 'DST' values. On the 'DST' box, run espsetup.sh and then run (and leave running) esplisten.pl.
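For anyone without the attachment handy, here's a rough sketch of the sort of thing espsetup.sh does; the real script is attached, and the addresses, SPIs and keys below are made-up placeholders. I'm assuming tunnel-mode AES-CBC with HMAC-SHA1, to match the sha_transform/_aesni_enc1 hotspots above:

#!/bin/sh
# Hypothetical approximation of espsetup.sh. Frob SRC/DST (and the
# SPIs and keys) for your own boxes; these values are placeholders.
SRC=10.0.186.130
DST=10.0.186.131

# One SA in each direction: AES-128-CBC encryption, HMAC-SHA1 auth.
ip xfrm state add src $SRC dst $DST proto esp spi 0x1000 mode tunnel \
    enc 'cbc(aes)' 0x00112233445566778899aabbccddeeff \
    auth 'hmac(sha1)' 0x00112233445566778899aabbccddeeff00112233
ip xfrm state add src $DST dst $SRC proto esp spi 0x1001 mode tunnel \
    enc 'cbc(aes)' 0x00112233445566778899aabbccddeeff \
    auth 'hmac(sha1)' 0x00112233445566778899aabbccddeeff00112233

# Send the 172.16.0.x test traffic through those SAs.
ip xfrm policy add src 172.16.0.1 dst 172.16.0.2 dir out \
    tmpl src $SRC dst $DST proto esp mode tunnel
ip xfrm policy add src 172.16.0.2 dst 172.16.0.1 dir in \
    tmpl src $DST dst $SRC proto esp mode tunnel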
In a copy of openconnect's tests/certs/ directory, run:

openssl s_server -accept 8443 -crlf -cert server-cert.pem -key server-key.pem

On the client 'SRC' box, run:

sudo ip tuntap add mode tun user $LOGNAME
sudo ip link set tun0 up
sudo ifconfig tun0 172.16.0.1 pointopoint 172.16.0.2
openconnect 10.0.186.131:8443 --protocol gp -C asd \
    --servercert pin-sha256:xp3scfzy3rOQsv9NcOve/8YVVv+pHr4qNCXEXrNl5s8= \
    --dtls-local-port=8443 -i tun0 -s /bin/true

Then paste the contents of gpconf.xml into the running s_server, and (assuming I typed all that from memory correctly) OpenConnect should be happy:

POST https://10.0.186.131:8443/ssl-vpn/hipreportcheck.esp
Connected as 172.16.0.1, using SSL, with ESP in progress
SIOCSIFMTU: Operation not permitted
ESP session established with server
ESP tunnel connected; exiting HTTPS mainloop.

You can also run espsetup.sh and esplisten.pl on the client box, to see the kernel←→kernel performance in your setup.

At this point, I appear to be doing better than the kernel is. Or at least, GnuTLS/nettle are doing better than the kernel; I can't take much credit for that beyond trying to keep my own code out of the way.

I do still see that when sending through OpenConnect, there's a discrepancy between what we *think* we've sent and what actually reaches the other side. The missing packets appear to precisely match the number reported as 'dropped' on the tun0 interface. I wonder if implementing BQL on the tun device would help with that, so that packets are still attributed to the sending socket while they're queued for the tun device.
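One way to sanity-check that theory, assuming the standard sysfs statistics and that your tun device is still called tun0, is to snapshot the TX drop counter around a run:

# Compare tun0's TX drops before and after a netperf run; the delta
# should match the sent/received discrepancy seen above.
before=$(cat /sys/class/net/tun0/statistics/tx_dropped)
./netperf -H 172.16.0.2 -t UDP_STREAM -- -m 1400
after=$(cat /sys/class/net/tun0/statistics/tx_dropped)
echo "tun0 dropped $((after - before)) packets during the run"

('ip -s link show tun0' reports the same counter.)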
Attachment:
espsetup.sh
Description: application/shellscript
Attachment:
gpconf.xml
Description: XML document
Attachment:
esplisten.pl
Description: Perl program