On Thu, 2019-04-18 at 21:16 +0000, Phillips, Tony wrote:
> Also of note is that when you do your tests, you’re getting CPU-bound
> at 100%. I never paid much attention to overall CPU until I saw your
> stats last week.
>
> We can’t get OC to run over maybe 35%.

T3.2xlarge on a fully contended host. Even with it set to unlimited CPU
credits, I do see some steal time which isn't entirely unexpected. Also a
bunch of hardirq and softirq time.

I've pinned it to CPU#0 (taskset -c 0 ./openconnect …) here to make it
easier to see what's going on (press '1' in top to see per-CPU stats):

top - 22:35:57 up 31 min,  3 users,  load average: 0.88, 0.45, 0.23
Tasks: 132 total,   2 running, 130 sleeping,   0 stopped,   0 zombie
%Cpu0  : 37.3 us, 45.1 sy,  0.0 ni,  1.0 id,  0.0 wa,  4.9 hi,  3.9 si,  7.8 st
%Cpu1  :  1.0 us,  1.0 sy,  0.0 ni, 97.1 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  : 76.2 us, 21.8 sy,  0.0 ni,  0.0 id,  0.0 wa,  1.0 hi,  0.0 si,  1.0 st
MiB Mem :  31812.8 total,  31401.2 free,    213.4 used,    198.2 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  31242.6 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 1255 fedora    20   0  163372   2888   1672 S  99.0   0.0   0:06.21 iperf
 1119 fedora    20   0   31804  13728  10888 R  83.0   0.0   2:07.91 lt-openconnect

$ iperf -u -c 172.16.0.2 -l 1400 -b 2000M
------------------------------------------------------------
Client connecting to 172.16.0.2, UDP port 5001
Sending 1400 byte datagrams, IPG target: 5.34 us (kalman adjust)
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 172.16.0.1 port 39160 connected with 172.16.0.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2.44 GBytes  2.10 Gbits/sec
[  3] Sent 1872458 datagrams
[  3] Server Report:
[  3]  0.0-10.0 sec  1.54 GBytes  1.32 Gbits/sec   0.004 ms 691707/1872458 (37%)

Bizarrely I am sometimes seeing double-digit percentages of idle time even
when it should be 100% busy shifting packets. Don't know if it's repeatable
but it may have gone away when I fixed the poll() in the middle of a busy
mainloop.

diff --git a/mainloop.c b/mainloop.c
index c44bed8d..1ca71587 100644
--- a/mainloop.c
+++ b/mainloop.c
@@ -239,7 +239,6 @@ int openconnect_mainloop(struct openconnect_info *vpninfo,
 		if (vpninfo->quit_reason)
 			break;
 
-		poll_cmd_fd(vpninfo, 0);
 		if (vpninfo->got_cancel_cmd) {
 			if (vpninfo->cancel_type == OC_CMD_CANCEL) {
 				vpninfo->quit_reason = "Aborted by caller";
@@ -309,6 +308,7 @@ int openconnect_mainloop(struct openconnect_info *vpninfo,
 			udp_r = FD_ISSET(vpninfo->dtls_fd, &rfds);
 		if (vpninfo->ssl_fd >= 0)
 			tcp_r = FD_ISSET(vpninfo->ssl_fd, &rfds);
+		check_cmd_fd(vpninfo, &rfds);
 #endif
 	}
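FWIW the win there is just dropping one syscall per trip round the loop:
the command fd can be tested from the fd_set that select() has already
filled, instead of making a separate poll() call every iteration. A
standalone sketch of that pattern is below — the names (cmd_fd, data_fd,
handle_command, handle_data) are made up for illustration and this is not
the actual openconnect helper code:

#include <sys/select.h>
#include <unistd.h>

static void handle_command(int fd)
{
	char c;
	(void)read(fd, &c, 1);			/* consume one command byte */
}

static void handle_data(int fd)
{
	char buf[2048];
	(void)read(fd, buf, sizeof(buf));	/* stand-in for shifting a packet */
}

static void event_loop(int cmd_fd, int data_fd)
{
	for (;;) {
		fd_set rfds;
		int maxfd = cmd_fd > data_fd ? cmd_fd : data_fd;

		FD_ZERO(&rfds);
		FD_SET(cmd_fd, &rfds);		/* command pipe rides along for free */
		FD_SET(data_fd, &rfds);

		if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
			break;

		/* No separate poll(cmd_fd) needed: select() already told us. */
		if (FD_ISSET(cmd_fd, &rfds))
			handle_command(cmd_fd);
		if (FD_ISSET(data_fd, &rfds))
			handle_data(data_fd);
	}
}

One avoided syscall per iteration adds up when the loop is moving well over
100k packets a second, as in the iperf run above.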
This was showing up high on the profile if you dig into esp_mainloop() too,
and although it's mostly in the noise it annoyed me. I think this is
better. We should really switch to a ring for these anyway.

diff --git a/openconnect-internal.h b/openconnect-internal.h
index d998f5d8..a9880b1a 100644
--- a/openconnect-internal.h
+++ b/openconnect-internal.h
@@ -304,8 +304,9 @@ static inline struct pkt *dequeue_packet(struct pkt_q *q)
 
 	if (ret) {
 		q->head = ret->next;
-		if (!--q->count)
+		if (q->tail == &ret->next)
 			q->tail = &q->head;
+		q->count--;
 	}
 	return ret;
 }
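If anyone wants to play with the ring idea, it could be as simple as the
sketch below. To be clear, this is just an illustration and not anything in
the tree; struct pkt_ring, PKT_RING_SIZE and the helper names are invented,
and it assumes a single producer and a single consumer so no locking is
shown:

/* Illustrative only: a fixed-size ring of packet pointers with
 * free-running head/tail counters.  PKT_RING_SIZE must be a power of
 * two so indices can simply be masked. */
#define PKT_RING_SIZE 256

struct pkt;

struct pkt_ring {
	struct pkt *pkts[PKT_RING_SIZE];
	unsigned int head;		/* next slot to dequeue */
	unsigned int tail;		/* next slot to enqueue */
};

static inline int ring_enqueue(struct pkt_ring *r, struct pkt *p)
{
	if (r->tail - r->head == PKT_RING_SIZE)
		return -1;					/* full */
	r->pkts[r->tail++ & (PKT_RING_SIZE - 1)] = p;
	return 0;
}

static inline struct pkt *ring_dequeue(struct pkt_ring *r)
{
	if (r->head == r->tail)
		return NULL;					/* empty */
	return r->pkts[r->head++ & (PKT_RING_SIZE - 1)];
}

static inline unsigned int ring_count(struct pkt_ring *r)
{
	return r->tail - r->head;			/* no separate count field */
}

That would get rid of the per-packet pointer chasing and the tail
bookkeeping entirely; the count just falls out of the two indices.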