On Fri, 2013-06-28 at 09:59 -0500, John McMonagle wrote:
> On Friday, June 28, 2013 04:54:12 am Nicolas Sebrecht wrote:
> > The 27/06/13, John McMonagle wrote:
> > > Running traffic shaping both in and out.
> > > Creating ptp connections via openvpn.
> > > Routing the tunnels with ospf.
> > >
> > > Having a problem with outgoing traffic shaping.
> > > txqueuelen on the tunnels is normally 100.
> > > At that setting I have horrible latency at times.
> > > If I lower txqueuelen it keeps latency under control, but I end up
> > > with excessive packet loss.
> > >
> > > The more I think about it, putting another queue before the traffic
> > > shaping creates an unsolvable problem.
> > > I'm tempted to try ipsec and gre tunnels but suspect the problem
> > > will be the same.
> > >
> > > How about adding traffic shaping inside the tunnels?
> > > I have 5 tunnels; how would one get the tunnel shaping to work with
> > > the shaping on the outgoing interface?
> > >
> > > Any suggestions?
> >
> > I have enabled htb with sfq on a router providing 8 openvpn tunnels. I
> > did it using the "up" option in the configuration file of each VPN. It
> > allows loading a shell script (hook) once the TUN device is created by
> > openvpn. The script just applies the QoS on the TUN device of the
> > tunnel.
> >
> > I guess something very similar can be done on the client side if ever
> > needed.
> >
> > Nicolas
>
> Not that it's relevant, but my tunnels are always up.
> I can traffic shape the input to the tunnels, but that requires setting
> a fixed outgoing bandwidth for each.
> I suspect if I set them all to 1/2 of the full bandwidth it would help
> a little.
> It would be ideal if the tunnel interface could be traffic shaped on
> one of the sub-queues of the outgoing interface's traffic shaping.
> I'm sure I have some of the terminology messed up ;-(
>
> I noticed that if I create a gre interface there are no transmit
> buffers.
> If a gre interface has no buffers, maybe that would help?
<snip>

I'm not sure that it solves your problem, but here are my notes from how we handled it:

Traffic shaping with VPN presents some challenges. Some VPN technologies such as OpenVPN and KLIPS create virtual interfaces. The traffic from these interfaces must be pooled with the traffic on the physical interfaces for traffic shaping. Moreover, the traffic cannot be double counted. For example, if an OpenVPN packet comes in on eth1 on UDP port 1194 and then appears unencrypted as an SSH packet on interface tun0, how much bandwidth has that consumed for our HFSC service curve calculations? There is a similar problem with netkey because the same traffic passes through the same interface twice - once unencrypted and then encrypted. This also creates a challenge regarding visibility, as sometimes the unencrypted contents are not visible and thus cannot be classified.

The problems are slightly different between egress and ingress traffic shaping. We ultimately found we could not use the most efficient form of classification, CONNMARK, but that is just as well, as we cannot use it on some devices; e.g., Endians use all the available marks internally, leaving none available for us.
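For reference, the CONNMARK approach we would have preferred looks roughly like this; the class ID and port here are just placeholders, and as described below, it did not work for us across the tunnel:

# Mark the connection once on its first packet, then restore the mark
# onto every subsequent packet of the connection:
iptables -t mangle -A POSTROUTING -p tcp --sport 443 -m state --state NEW -j CONNMARK --set-mark 10
iptables -t mangle -A POSTROUTING -j CONNMARK --restore-mark
# A tc fw filter would then map the restored mark to an HFSC class:
tc filter replace dev ifb1 parent 1:0 protocol ip prio 1 handle 10 fw flowid 1:10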
Egress VPN Traffic Shaping

We will use an IFB interface to coalesce the traffic from the various interfaces into a single queue. This implies that we need to create a placeholder PRIO qdisc for each interface, including the physical interface, so that we can apply the redirecting filter to the interface. We can use a two-band queue and send all traffic to the first band, from which it is redirected into the IFB interface, e.g.,

tc qdisc replace dev eth1 root handle 2: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
tc filter replace dev eth1 parent 2:0 protocol ip prio 1 u32 match u8 0 0 flowid 2:1 action mirred egress redirect dev ifb1

We then create an HFSC qdisc on ifb1 with appropriate classes. The challenge is visibility on the IFB interface. Traffic redirected from tun+ (and probably ipsec+, although we did not test that) has not yet been encrypted, so we could potentially examine the packet. However, the netkey traffic appears to be already encrypted when it reaches the IFB interface, foiling any tc filter based classification.

We also tried using CONNMARK to mark the connection and then restore the mark for each packet (the rules sketched above). In fact, this would be our preference, as it is the lowest overhead solution, but it failed. Perhaps the mark is not preserved when the packet is encrypted. I have asked on the Linux netdev list but have not received a response.

The only thing that worked was the iptables CLASSIFY target. To avoid creating the same rule for every different interface, we created a user-defined ESHAPE (Egress SHAPE) chain and jumped all traffic going out on the physical interface or on the virtual interfaces (e.g., tun0) to it, e.g.,

iptables -t mangle -N ESHAPE
iptables -t mangle -A POSTROUTING -o eth1 -j ESHAPE
iptables -t mangle -A POSTROUTING -o tun+ -j ESHAPE
iptables -t mangle -A ESHAPE -p 6 --sport 82 -j CLASSIFY --set-class 1:10
iptables -t mangle -A ESHAPE -p 6 --sport 443 -j CLASSIFY --set-class 1:10

We had some concern that the encapsulated traffic, e.g., ESP or UDP port 1194, would be classified into the default HFSC queue and drag down any prioritized traffic, but this does not appear to be the case. We tested by having only rt curves, setting the default one much lower than the prioritized curve, and sending prioritized traffic through the tunnel; it all passed at the prioritized rate.

Ingress VPN Traffic Shaping

The issues were quite different on ingress. We similarly created an IFB interface to coalesce the traffic, but visibility was not a problem: the IFB interface saw the unencrypted traffic all the time. However, we could not use the CLASSIFY target, since it cannot be used in the mangle table PREROUTING chain. We could not use packet marking, since the packets arrive on the IFB interface before they have been marked. Thus, the only option was tc filters - generally complicated linked filters, so that we accommodate the rare case where IP options are used, which would otherwise throw off the calculation of the TCP header offset (because the IP header then grows beyond its usual 20 bytes).

We also have the problem that the netkey packets are placed in the default HFSC queue and can drag down any decrypted priority traffic. Thus, we need to create a separate, high-service queue for the encapsulated traffic. This does not appear to be necessary for tun+ traffic and, we assume, ipsec+ traffic. Since ingress traffic shaping works on back pressure, a high-speed queue for encrypted traffic should not create a problem. In other words, even if we accept encapsulated traffic at a higher rate than we want, the outflow of decrypted traffic to the internal network is constrained by the rest of the HFSC queue, forcing packet drops of excessive traffic, which should slow down the sending stream, thus regulating the encrypted packets as well as the decrypted packets.

We thought about using this back-pressure principle to move the traffic shaping to the egress of the various internal interfaces, but that would have still required an IFB interface to coalesce the traffic and would have required redirects for every interface. Thus, an ingress filter seems more efficient.
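One note before the script: the linked u32 filters are the cryptic part. Here is the TCP pattern from the script with my reading of the offset arithmetic spelled out:

# Create a one-bucket u32 hash table (16:) to hold the TCP matches:
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 handle 16: u32 divisor 1
# Link TCP packets to it, computing the TCP header offset from the IHL
# field: the 16-bit word at offset 0 is version/IHL/TOS; mask 0x0f00
# isolates IHL (the header length in 32-bit words) shifted left by 8,
# and shift 6 turns that into IHL*4, the header length in bytes:
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 match ip protocol 6 0xff link 16: offset at 0 mask 0x0f00 shift 6 plus 0
# Matches in the linked table are then relative to the start of the TCP
# header (nexthdr+0 is the source port, nexthdr+2 the destination port),
# so they remain correct even when IP options are present:
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp dst 822 0xffff at nexthdr+2 flowid 4:30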
Sample test script

#!/bin/sh
modprobe ifb
ifconfig ifb0 up
ifconfig ifb1 up

# Placeholder two-band PRIO qdiscs on the physical and tunnel interfaces
tc qdisc replace dev eth1 root handle 2: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
tc qdisc replace dev tun0 root handle 3: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# Egress HFSC on ifb1
tc qdisc replace dev ifb1 root handle 1 hfsc default 20
tc class replace dev ifb1 parent 1:0 classid 1:1 hfsc ul rate 100000kbit ls rate 100000kbit
tc class replace dev ifb1 parent 1:1 classid 1:20 hfsc rt rate 150kbit #ls rate 40000kbit
tc class replace dev ifb1 parent 1:1 classid 1:10 hfsc rt rate 500kbit #ls rate 50000kbit
tc class replace dev ifb1 parent 1:1 classid 1:30 hfsc sc rate 10000kbit
tc qdisc replace dev ifb1 parent 1:20 handle 1201 sfq
tc qdisc replace dev ifb1 parent 1:10 handle 1101 sfq
tc qdisc replace dev ifb1 parent 1:30 handle 1301 sfq

# Egress classification via the ESHAPE chain
iptables -t mangle -N ESHAPE
iptables -t mangle -A POSTROUTING -o eth1 -j ESHAPE
iptables -t mangle -A POSTROUTING -o tun+ -j ESHAPE
iptables -t mangle -A ESHAPE -p 6 --sport 82 -j CLASSIFY --set-class 1:10
iptables -t mangle -A ESHAPE -p 6 --sport 443 -j CLASSIFY --set-class 1:10
iptables -t mangle -A ESHAPE -p 6 --sport 822 -j CLASSIFY --set-class 1:30
iptables -t mangle -A ESHAPE -p 6 --dport 822 -j CLASSIFY --set-class 1:30
# Small bare ACKs to the low latency queue
iptables -t mangle -A ESHAPE -p 6 --tcp-flags SYN,RST,ACK,FIN ACK -m length --length 20:43 -j CLASSIFY --set-class 1:30
# DNS and IKE (UDP, protocol 17)
iptables -t mangle -A ESHAPE -p 17 --sport 53 -j CLASSIFY --set-class 1:30
iptables -t mangle -A ESHAPE -p 17 --dport 53 -j CLASSIFY --set-class 1:30
iptables -t mangle -A ESHAPE -p 17 --sport 500 -j CLASSIFY --set-class 1:30
iptables -t mangle -A ESHAPE -p 17 --dport 500 -j CLASSIFY --set-class 1:30
iptables -t mangle -A ESHAPE -p 17 --sport 4500 -j CLASSIFY --set-class 1:30
iptables -t mangle -A ESHAPE -p 17 --dport 4500 -j CLASSIFY --set-class 1:30

# Ingress HFSC on ifb0
tc qdisc replace dev ifb0 root handle 4 hfsc default 20
tc class replace dev ifb0 parent 4:0 classid 4:1 hfsc ul rate 100000kbit ls rate 100000kbit
tc class replace dev ifb0 parent 4:1 classid 4:20 hfsc rt rate 150kbit #ls rate 40000kbit
tc class replace dev ifb0 parent 4:1 classid 4:10 hfsc rt rate 500kbit #ls rate 50000kbit
tc class replace dev ifb0 parent 4:1 classid 4:30 hfsc sc rate 10000kbit
tc class replace dev ifb0 parent 4:1 classid 4:40 hfsc rt rate 100000kbit
tc qdisc replace dev ifb0 parent 4:20 handle 4201 sfq
tc qdisc replace dev ifb0 parent 4:10 handle 4101 sfq
tc qdisc replace dev ifb0 parent 4:30 handle 4301 sfq
tc qdisc replace dev ifb0 parent 4:40 handle 4401 sfq

# Encapsulated (ESP, protocol 50) traffic goes to the high-service class 4:40
tc filter replace dev ifb0 parent 4:0 protocol ip prio 1 u32 match ip protocol 50 0xff flowid 4:40

# Linked u32 filters for TCP
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 handle 16: u32 divisor 1
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 match ip protocol 6 0xff link 16: offset at 0 mask 0x0f00 shift 6 plus 0
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp dst 822 0xffff at nexthdr+2 flowid 4:30
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp src 822 0xffff at nexthdr+0 flowid 4:30
# Send packets <64 bytes (u16 0 0xffc0 at 2) with only the ACK flag set (match u8 16 0xff at nexthdr+13) to the low latency queue
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match u16 0 0xffc0 at 2 match u8 16 0xff at nexthdr+13 flowid 4:30
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp src 443 0xffff at nexthdr+0 flowid 4:10
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp src 82 0xffff at nexthdr+0 flowid 4:10

# Linked u32 filters for UDP (DNS and IKE)
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 handle 117: u32 divisor 1
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 match ip protocol 17 0xff link 117: offset at 0 mask 0x0f00 shift 6 plus 0
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp dst 53 0xffff at nexthdr+2 flowid 4:30
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp src 53 0xffff at nexthdr+0 flowid 4:30
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp dst 500 0xffff at nexthdr+2 flowid 4:30
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp src 500 0xffff at nexthdr+0 flowid 4:30
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp dst 4500 0xffff at nexthdr+2 flowid 4:30
tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp src 4500 0xffff at nexthdr+0 flowid 4:30

# Short queues and no offload aggregation, so shaping sees real packets
ip link set eth1 txqueuelen 10
ip link set tun0 txqueuelen 10
ethtool -K eth1 gso off gro off
ethtool -K eth0 gso off gro off
ethtool -K eth2 gso off gro off

# Redirect egress traffic into ifb1
tc filter replace dev eth1 parent 2:0 protocol ip prio 1 u32 match u8 0 0 flowid 2:1 action mirred egress redirect dev ifb1
tc filter replace dev tun0 parent 3:0 protocol ip prio 1 u32 match u8 0 0 flowid 3:1 action mirred egress redirect dev ifb1

# Redirect ingress traffic into ifb0
tc qdisc replace dev eth1 ingress
tc filter replace dev eth1 parent ffff: protocol ip prio 1 u32 match u8 0 0 action mirred egress redirect dev ifb0
tc qdisc replace dev tun0 ingress
tc filter replace dev tun0 parent ffff: protocol ip prio 1 u32 match u8 0 0 action mirred egress redirect dev ifb0

Note that we have prioritized the IKE packets used to manage the IPsec connections.
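For anyone reproducing this, the tc statistics are the easiest way to confirm which class traffic actually lands in, e.g.:

tc -s class show dev ifb1
tc -s class show dev ifb0
tc -s filter show dev ifb0 parent 4:0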
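P.S. Regarding Nicolas's "up" hook suggestion quoted above: a minimal sketch, assuming OpenVPN 2.1+ and a simple HTB+SFQ policy (the script path and rates are placeholders, not what Nicolas actually used), would be something like this in each tunnel's configuration file:

script-security 2
up /etc/openvpn/qos-up.sh

with /etc/openvpn/qos-up.sh along these lines (openvpn passes the TUN device name as the first argument):

#!/bin/sh
DEV="$1"  # TUN device just created by openvpn
tc qdisc replace dev "$DEV" root handle 1: htb default 20
tc class replace dev "$DEV" parent 1: classid 1:20 htb rate 1000kbit ceil 2000kbit
tc qdisc replace dev "$DEV" parent 1:20 handle 120: sfq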