On Fri, 2019-04-05 at 14:54 +0100, David Woodhouse wrote:
>
> $ sudo perf record -a ./netperf -H 172.16.0.1 -t UDP_STREAM -- -m 1400
> MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.16.0.1 () port 0 AF_INET
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 212992    1400   10.00     1198093      0     1341.86
> 212992           10.00     1198044            1341.80
>
> At this point netperf is taking 100% of CPU, doing this:
>
> Samples: 49K of event 'cycles', Event count (approx.): 32648308583
>   Overhead  Command  Shared Object     Symbol
>     31.59%  netperf  [kernel.vmlinux]  [k] sha_transform
>     17.49%  netperf  [kernel.vmlinux]  [k] _aesni_enc1
>      2.88%  netperf  [kernel.vmlinux]  [k] _raw_spin_lock

And now with OpenConnect on the sending end:

$ ./netperf -H 172.16.0.2 -t UDP_STREAM -- -m 1400
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.16.0.2 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1400   10.00     8132570      0     9108.46
212992           10.00     1478093            1655.46

Both openconnect and netperf are eating 100% of CPU:

Overhead  Command         Shared Object         Symbol
  12.05%  lt-openconnect  libgnutls.so.30.23.2  [.] sha1_block_data_order_ssse3
   7.34%  lt-openconnect  libgnutls.so.30.23.2  [.] aesni_cbc_encrypt
   3.34%  swapper         [kernel.vmlinux]      [k] intel_idle
   3.21%  netperf         [kernel.vmlinux]      [k] entry_SYSCALL_64
   2.93%  netperf         [kernel.vmlinux]      [k] csum_partial_copy_generic

In the opposite direction (receiving into OpenConnect) I get:

$ ./netperf -H 172.16.0.1 -t UDP_STREAM -- -m 1400
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.16.0.1 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1400   10.00     1258914      0     1409.97
212992           10.00     1255689            1406.36

This time OpenConnect is using a whole CPU, but netserver is only at about 20%:

Overhead  Command         Shared Object         Symbol
  14.58%  lt-openconnect  libgnutls.so.30.23.2  [.] sha1_block_data_order_ssse3
  10.88%  swapper         [kernel.vmlinux]      [k] intel_idle
   2.98%  lt-openconnect  [kernel.vmlinux]      [k] entry_SYSCALL_64
   2.78%  lt-openconnect  [kernel.vmlinux]      [k] copy_user_enhanced_fast_string
   2.61%  lt-openconnect  libnettle.so.6.5      [.] _nettle_sha1_compress
   2.32%  netserver       [kernel.vmlinux]      [k] csum_partial_copy_generic
   2.20%  lt-openconnect  [kernel.vmlinux]      [k] syscall_return_via_sysret
   2.03%  lt-openconnect  libgnutls.so.30.23.2  [.] aesni_cbc_encrypt

In either direction, on these Skylake boxes the userspace path seems to perform at least as well as the kernel implementation, and has no trouble sustaining 1Gb/s.

Tony, can you do the equivalent experiments on your own setup? I'm attaching my espsetup.sh again now that I have it working, along with the fake responses (gpconf.xml). You'll need to frob the 'SRC' and 'DST' values. On the 'DST' box, run espsetup.sh and then run (and leave running) esplisten.pl.
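For anyone without the attachment handy, here's a rough sketch of the sort of thing espsetup.sh does; the real script is attached, and the addresses, SPIs and keys below are made-up placeholders. I'm assuming tunnel-mode AES-CBC with HMAC-SHA1, to match the sha_transform/_aesni_enc1 hotspots above:

#!/bin/sh
# Hypothetical approximation of espsetup.sh. Frob SRC/DST (and the
# SPIs and keys) for your own boxes; these values are placeholders.
SRC=10.0.186.130
DST=10.0.186.131

# One SA in each direction: AES-128-CBC encryption, HMAC-SHA1 auth.
ip xfrm state add src $SRC dst $DST proto esp spi 0x1000 mode tunnel \
    enc 'cbc(aes)' 0x00112233445566778899aabbccddeeff \
    auth 'hmac(sha1)' 0x00112233445566778899aabbccddeeff00112233
ip xfrm state add src $DST dst $SRC proto esp spi 0x1001 mode tunnel \
    enc 'cbc(aes)' 0x00112233445566778899aabbccddeeff \
    auth 'hmac(sha1)' 0x00112233445566778899aabbccddeeff00112233

# Send the 172.16.0.x test traffic through those SAs.
ip xfrm policy add src 172.16.0.1 dst 172.16.0.2 dir out \
    tmpl src $SRC dst $DST proto esp mode tunnel
ip xfrm policy add src 172.16.0.2 dst 172.16.0.1 dir in \
    tmpl src $DST dst $SRC proto esp mode tunnel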
In a copy of openconnect's tests/certs/ directory, run:

openssl s_server -accept 8443 -crlf -cert server-cert.pem -key server-key.pem

On the client 'SRC' box, run:

sudo ip tuntap add mode tun user $LOGNAME
sudo ip link set tun0 up
sudo ifconfig tun0 172.16.0.1 pointopoint 172.16.0.2
openconnect 10.0.186.131:8443 --protocol gp -C asd \
    --servercert pin-sha256:xp3scfzy3rOQsv9NcOve/8YVVv+pHr4qNCXEXrNl5s8= \
    --dtls-local-port=8443 -i tun0 -s /bin/true

Then paste the contents of gpconf.xml into the running s_server, and (assuming I typed all that from memory correctly) OpenConnect should be happy:

POST https://10.0.186.131:8443/ssl-vpn/hipreportcheck.esp
Connected as 172.16.0.1, using SSL, with ESP in progress
SIOCSIFMTU: Operation not permitted
ESP session established with server
ESP tunnel connected; exiting HTTPS mainloop.

You can also run espsetup.sh and esplisten.pl on the client box, to see the kernel←→kernel performance in your setup.

At this point, I appear to be doing better than the kernel is. Or at least, GnuTLS/nettle are doing better than the kernel; I can't take much credit for that beyond trying to keep my own code out of the way.

I do still see that when sending through OpenConnect, there's a discrepancy between what we *think* we've sent and what actually reaches the other side. The missing packets appear to precisely match the number reported as 'dropped' on the tun0 interface. I wonder if implementing BQL on the tun device would help with that, so that packets are still attributed to the sending socket while they're queued for the tun device.
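One way to sanity-check that theory, assuming the standard sysfs statistics and that your tun device is still called tun0, is to snapshot the TX drop counter around a run:

# Compare tun0's TX drops before and after a netperf run; the delta
# should match the sent/received discrepancy seen above.
before=$(cat /sys/class/net/tun0/statistics/tx_dropped)
./netperf -H 172.16.0.2 -t UDP_STREAM -- -m 1400
after=$(cat /sys/class/net/tun0/statistics/tx_dropped)
echo "tun0 dropped $((after - before)) packets during the run"

('ip -s link show tun0' reports the same counter.)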
Attachment:
espsetup.sh
Description: application/shellscript
Attachment:
gpconf.xml
Description: XML document
Attachment:
esplisten.pl
Description: Perl program