On Fri, 2013-07-12 at 15:41 -0500, John McMonagle wrote:
> On Sunday 30 June 2013 8:09:14 AM John A. Sullivan III wrote:
> > On Fri, 2013-06-28 at 09:59 -0500, John McMonagle wrote:
> > > On Friday, June 28, 2013 04:54:12 am Nicolas Sebrecht wrote:
> > > > The 27/06/13, John McMonagle wrote:
> > > > > Running traffic shaping both in and out.
> > > > > Creating ptp connections via openvpn.
> > > > > Routing the tunnels with ospf.
> > > > >
> > > > > Having a problem with outgoing traffic shaping.
> > > > > txqueuelen on the tunnels is normally 100.
> > > > > At that setting I have horrible latency at times.
> > > > > If I lower txqueuelen it keeps latency under control, but I end
> > > > > up with excessive packet loss.
> > > > >
> > > > > The more I think about it, putting another queue before the
> > > > > traffic shaping creates an unsolvable problem.
> > > > > I'm tempted to try ipsec and gre tunnels but suspect the problem
> > > > > will be the same.
> > > > >
> > > > > How about adding traffic shaping inside the tunnels?
> > > > > I have 5 tunnels; how would one get the tunnel shaping to work
> > > > > with the shaping on the outgoing interface?
> > > > >
> > > > > Any suggestions?
> > > >
> > > > I have enabled htb with sfq on a router providing 8 openvpn
> > > > tunnels. I made it using the "up" option in the configuration file
> > > > of each VPN. It allows loading a shell script (hook) once the TUN
> > > > device is created by openvpn. The script just applies the QoS on
> > > > the TUN device of the tunnel.
> > > >
> > > > I guess something very similar can be done on the client side if
> > > > ever needed.
> > >
> > > Nicolas
> > >
> > > Not that it's relevant, but my tunnels are always up.
> > > I can traffic shape, but the input to the tunnels needs to be set to
> > > a fixed outgoing bandwidth.
> > > I suspect if I set all to 1/2 of the full bandwidth it would help a
> > > little.
> > > It would be ideal if the tunnel interface could be traffic shaped on
> > > one of the sub-queues of the outgoing interface's traffic shaping.
> > > I'm sure I have some of the terminology messed up ;-(
> > >
> > > I noticed that if I create a gre interface there are no transmit
> > > buffers.
> > > If a gre interface has no buffers, maybe that would help?
> >
> > <snip>
> > I'm not sure that it solves your problem, but here are my notes on how
> > we handled it:
> >
> > John
>
> Have a partial understanding of what you are doing.
> If you're having minimal packet losses with txqueuelen at 10, you must
> be doing something right.
>
> I'm still a bit confused by using ifb1 on the outgoing interfaces.
> Is there anything that explains how the packets are processed through
> traffic shaping?

Hi, John. My apologies for the huge delay; I've been overrun at work! I
need work shaping :)

If I recall correctly, the idea of using ifb1 on outgoing interfaces was
so that we could use the same traffic shaping rules on multiple
interfaces. For example, we might have the physical interface eth0 and a
tun interface tun0 for OpenVPN, or an ipsec0 interface if we are using
the old KLIPS approach to IPSec. The traffic shaping rules which apply
to eth0 would not apply to tun0's traffic because, by the time it
reaches eth0, it is already encapsulated. Thus, we need to apply the
same rules we would apply to eth0 to tun0 so that the traffic is shaped
before encapsulation. We could write and maintain two identical rule
sets, but it is easier to maintain one rule set, apply it to ifb1, and
direct both eth0 and tun0 into ifb1.

The packets come into eth0 or tun0, are immediately placed onto ifb1,
where they are shaped, and are then returned to eth0 or tun0 right where
they departed, to be acted upon by netfilter.
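For illustration, here is the plumbing for that redirect, condensed from
the script in my notes below (a sketch; eth0/tun0 stand in for whatever
interfaces carry the traffic):

modprobe ifb
ifconfig ifb1 up
# placeholder two-band prio qdiscs give us somewhere to attach the
# redirecting filters
tc qdisc replace dev eth0 root handle 2: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
tc qdisc replace dev tun0 root handle 3: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# match-anything u32 filters redirect every packet onto ifb1
tc filter replace dev eth0 parent 2:0 protocol ip prio 1 u32 match u8 0 0 flowid 2:1 action mirred egress redirect dev ifb1
tc filter replace dev tun0 parent 3:0 protocol ip prio 1 u32 match u8 0 0 flowid 3:1 action mirred egress redirect dev ifb1
# the single shared rule set (the HFSC qdisc, classes, and filters) then
# lives only on ifb1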
I hope that helps - John

> > Traffic shaping with VPN presents some challenges. Some VPN
> > technologies such as OpenVPN and KLIPS create virtual interfaces. The
> > traffic from these interfaces must be pooled with the traffic on the
> > physical interfaces for traffic shaping. Moreover, the traffic cannot
> > be double counted, e.g., if an OpenVPN packet comes in on eth1 on UDP
> > port 1194 and then appears unencrypted as an SSH packet on interface
> > tun0, how much bandwidth has that consumed for our HFSC service curve
> > calculations?
> >
> > There is a similar problem with netkey because the same traffic
> > passes through the same interface twice - once unencrypted and then
> > encrypted. This also creates a challenge regarding visibility, as
> > sometimes the unencrypted contents are not visible and thus cannot be
> > classified. The problems are slightly different between egress and
> > ingress traffic shaping.
> >
> > We ultimately found we could not use the most efficient form of
> > classification, CONNMARK, but that is just as well, as we cannot use
> > it on some devices; e.g., Endians use all the available marks
> > internally, leaving none available for us.
> >
> > Egress VPN Traffic Shaping
> >
> > We will use an IFB interface to coalesce the traffic from the various
> > interfaces into a single queue. This implies that we need to create a
> > placeholder PRIO qdisc for each interface, including the physical
> > interface, so that we can apply the redirecting filter to the
> > interface. We can use a two-band queue and send all traffic to the
> > first band, from which it is redirected into the IFB interface, e.g.,
> >
> > tc qdisc replace dev eth1 root handle 2: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > tc filter replace dev eth1 parent 2:0 protocol ip prio 1 u32 match u8 0 0 flowid 2:1 action mirred egress redirect dev ifb1
> >
> > We then create an HFSC qdisc on ifb1 with appropriate classes. The
> > challenge is visibility on the IFB interface. Traffic redirected from
> > tun+ (and probably ipsec+, although we did not test that) has not yet
> > been encrypted, so we could potentially examine the packet. However,
> > the netkey traffic appears to be already encrypted when it reaches
> > the IFB interface, foiling any tc filter based classification. We
> > also tried using CONNMARK to mark the connection and then restore it
> > for each packet. In fact, this would be our preference, as it is the
> > lowest overhead solution, but it failed. Perhaps the mark is not
> > preserved when the packet is encrypted. I have asked on the Linux
> > net-dev list but have not received a response.
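> >
> > From memory, the attempt was shaped roughly like this (mark value and
> > port illustrative, not our exact rules):
> >
> > iptables -t mangle -A POSTROUTING -o eth1 -p 6 --sport 443 -j CONNMARK --set-mark 10
> > iptables -t mangle -A POSTROUTING -j CONNMARK --restore-mark
> > tc filter replace dev ifb1 parent 1:0 protocol ip prio 1 handle 10 fw flowid 1:10
> >
> > The fw classification never took effect on the netkey traffic,
> > presumably because the mark did not survive encryption.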
> > The only thing that worked was the iptables CLASSIFY target. To
> > avoid creating the same rule for every different interface, we
> > created a user-defined ESHAPE (Egress SHAPE) chain and jumped all
> > traffic going out on the physical interface or on the virtual
> > interfaces (e.g., tun0) to it, e.g.,
> >
> > iptables -t mangle -N ESHAPE
> > iptables -t mangle -A POSTROUTING -o eth1 -j ESHAPE
> > iptables -t mangle -A POSTROUTING -o tun+ -j ESHAPE
> > iptables -t mangle -A ESHAPE -p 6 --sport 82 -j CLASSIFY --set-class 1:10
> > iptables -t mangle -A ESHAPE -p 6 --sport 443 -j CLASSIFY --set-class 1:10
> >
> > We had some concern that the encapsulated traffic, e.g., ESP or UDP
> > port 1194, would be classified into the default HFSC queue and drag
> > down any prioritized traffic, but this does not appear to be the
> > case. We tested by having only rt curves, setting the default one
> > much lower than the prioritized curve, and sending prioritized
> > traffic through the tunnel; it all passed at the prioritized rate.
> >
> > Ingress VPN Traffic Shaping
> >
> > The issues were quite different on ingress. We similarly created an
> > IFB interface to coalesce the traffic, but visibility was not a
> > problem. The IFB interface saw the unencrypted traffic all the time.
> > However, we could not use the CLASSIFY target since it cannot be used
> > in the mangle table PREROUTING chain. We could not use packet marking
> > since the packets arrive on the IFB interface before they have been
> > marked. Thus, the only option was tc filters, and generally
> > complicated linked filters so that we accommodate the rare case where
> > IP options are used, throwing off the calculation of the TCP packet
> > offsets (because the IP header then becomes longer than 20 bytes).
> >
> > We also have the problem that the netkey packets are placed in the
> > default HFSC queue and can drag down any decrypted priority traffic.
> > Thus, we need to create a separate, high service queue for the
> > encapsulated traffic. This does not appear to be necessary for tun+
> > traffic and, we assume, ipsec+ traffic. Since ingress traffic shaping
> > works on back pressure, a high speed queue for encrypted traffic
> > should not create a problem. In other words, even if we accept
> > encapsulated traffic at a higher rate than we want, the outflow of
> > decrypted traffic to the internal network is constrained by the rest
> > of the HFSC queue, forcing packet drops of excessive traffic, which
> > should slow down the sending stream, thus regulating the encrypted
> > packets as well as the decrypted packets.
> >
> > We thought about using this principle of back pressure to move the
> > traffic shaping to the egress of the various internal interfaces, but
> > that would have still required an IFB interface to coalesce the
> > traffic and would have required redirects for every interface. Thus,
> > an ingress filter seems more efficient.
> >
> > Sample test script
> >
> > #!/bin/sh
> > modprobe ifb
> > ifconfig ifb0 up
> > ifconfig ifb1 up
> > tc qdisc replace dev eth1 root handle 2: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > tc qdisc replace dev tun0 root handle 3: prio bands 2 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > tc qdisc replace dev ifb1 root handle 1 hfsc default 20
> > tc class replace dev ifb1 parent 1:0 classid 1:1 hfsc ul rate 100000kbit ls rate 100000kbit
>
> Are these classes at fixed maximum rates?
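>
> > # For reference on the hfsc curve types used below: rt guarantees a
> > # minimum rate, ls sets the share of any excess bandwidth, ul is a
> > # hard ceiling, and sc sets rt and ls together.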
> > tc class replace dev ifb1 parent 1:1 classid 1:20 hfsc rt rate 150kbit #ls rate 40000kbit
> > tc class replace dev ifb1 parent 1:1 classid 1:10 hfsc rt rate 500kbit #ls rate 50000kbit
> > tc class replace dev ifb1 parent 1:1 classid 1:30 hfsc sc rate 10000kbit
> > tc qdisc replace dev ifb1 parent 1:20 handle 1201 sfq
> > tc qdisc replace dev ifb1 parent 1:10 handle 1101 sfq
> > tc qdisc replace dev ifb1 parent 1:30 handle 1301 sfq
> > iptables -t mangle -N ESHAPE
> > iptables -t mangle -A POSTROUTING -o eth1 -j ESHAPE
> > iptables -t mangle -A POSTROUTING -o tun+ -j ESHAPE
> > iptables -t mangle -A ESHAPE -p 6 --sport 82 -j CLASSIFY --set-class 1:10
> > iptables -t mangle -A ESHAPE -p 6 --sport 443 -j CLASSIFY --set-class 1:10
> > iptables -t mangle -A ESHAPE -p 6 --sport 822 -j CLASSIFY --set-class 1:30
> > iptables -t mangle -A ESHAPE -p 6 --dport 822 -j CLASSIFY --set-class 1:30
> > iptables -t mangle -A ESHAPE -p 6 --tcp-flags SYN,RST,ACK,FIN ACK -m length --length 20:43 -j CLASSIFY --set-class 1:30
> > iptables -t mangle -A ESHAPE -p 6 --sport 53 -j CLASSIFY --set-class 1:30
> > iptables -t mangle -A ESHAPE -p 6 --dport 53 -j CLASSIFY --set-class 1:30
> > iptables -t mangle -A ESHAPE -p 6 --sport 500 -j CLASSIFY --set-class 1:30
> > iptables -t mangle -A ESHAPE -p 6 --dport 500 -j CLASSIFY --set-class 1:30
> > iptables -t mangle -A ESHAPE -p 6 --sport 4500 -j CLASSIFY --set-class 1:30
> > iptables -t mangle -A ESHAPE -p 6 --dport 4500 -j CLASSIFY --set-class 1:30
> >
> > tc qdisc replace dev ifb0 root handle 4 hfsc default 20
> > tc class replace dev ifb0 parent 4:0 classid 4:1 hfsc ul rate 100000kbit ls rate 100000kbit
> > tc class replace dev ifb0 parent 4:1 classid 4:20 hfsc rt rate 150kbit #ls rate 40000kbit
> > tc class replace dev ifb0 parent 4:1 classid 4:10 hfsc rt rate 500kbit #ls rate 50000kbit
> > tc class replace dev ifb0 parent 4:1 classid 4:30 hfsc sc rate 10000kbit
> > tc class replace dev ifb0 parent 4:1 classid 4:40 hfsc rt rate 100000kbit
> > tc qdisc replace dev ifb0 parent 4:20 handle 4201 sfq
> > tc qdisc replace dev ifb0 parent 4:10 handle 4101 sfq
> > tc qdisc replace dev ifb0 parent 4:30 handle 4301 sfq
> > tc qdisc replace dev ifb0 parent 4:40 handle 4401 sfq
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 1 u32 match ip protocol 50 0xff flowid 4:40
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 handle 16: u32 divisor 1
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 match ip protocol 6 0xff link 16: offset at 0 mask 0x0f00 shift 6 plus 0
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp dst 822 0xffff at nexthdr+2 flowid 4:30
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp src 822 0xffff at nexthdr+0 flowid 4:30
> > # Send packets <64 bytes (u16 0 0xffc0 at 2) with only the ACK flag set (match u8 16 0xff at nexthdr+13) to the low latency queue
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match u16 0 0xffc0 at 2 match u8 16 0xff at nexthdr+13 flowid 4:30
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp src 443 0xffff at nexthdr+0 flowid 4:10
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 16:0 match tcp src 82 0xffff at nexthdr+0 flowid 4:10
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 handle 117: u32 divisor 1
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 match ip protocol 17 0xff link 117: offset at 0 mask 0x0f00 shift 6 plus 0
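> > # As with table 16: above, 'offset at 0 mask 0x0f00 shift 6' reads the
> > # IP header length (the IHL nibble) from the first 16-bit word and
> > # converts it to bytes (shift right 8 to isolate, times 4 = net shift
> > # 6), so the matches at nexthdr below still land on the UDP header
> > # when IP options are present.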
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp dst 53 0xffff at nexthdr+2 flowid 4:30
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp src 53 0xffff at nexthdr+0 flowid 4:30
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp dst 500 0xffff at nexthdr+2 flowid 4:30
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp src 500 0xffff at nexthdr+0 flowid 4:30
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp dst 4500 0xffff at nexthdr+2 flowid 4:30
> > tc filter replace dev ifb0 parent 4:0 protocol ip prio 2 u32 ht 117:0 match udp src 4500 0xffff at nexthdr+0 flowid 4:30
> >
> > ip link set eth1 txqueuelen 10
> > ip link set tun0 txqueuelen 10
>
> Are the ethtool statements part of the shaping or a hardware fix?
>
> > ethtool -K eth1 gso off gro off
> > ethtool -K eth0 gso off gro off
> > ethtool -K eth2 gso off gro off
>
> In the next 2 lines, is this where the other traffic that was not
> handled by the ESHAPE chain gets put into ifb1?
>
> > tc filter replace dev eth1 parent 2:0 protocol ip prio 1 u32 match u8 0 0 flowid 2:1 action mirred egress redirect dev ifb1
> > tc filter replace dev tun0 parent 3:0 protocol ip prio 1 u32 match u8 0 0 flowid 3:1 action mirred egress redirect dev ifb1
> > tc qdisc replace dev eth1 ingress
> > tc filter replace dev eth1 parent ffff: protocol ip prio 1 u32 match u8 0 0 action mirred egress redirect dev ifb0
> > tc qdisc replace dev tun0 ingress
> > tc filter replace dev tun0 parent ffff: protocol ip prio 1 u32 match u8 0 0 action mirred egress redirect dev ifb0
> >
> > Note that we have prioritized the IKE packets used to manage the
> > IPSec connections.
>
> Thanks for the information so far.
>
> John

--
To unsubscribe from this list: send the line "unsubscribe lartc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html