-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

I'm sure I took a wrong turn somewhere, but when performing congestion
tests, I'm seeing packets dropped in higher priority queues and not in
lower ones. Can someone help me troubleshoot what is wrong?

enp7s0f{0,1} are in an LACP bond and we are only testing to one
destination host, hence the skew in the tc output.

Test traffic is generated by:

# iperf3 -c 10.217.89.26 -p 5201 -u -b 80G -t 600 -i 5

The host has a dual 40 Gb NIC. The goal is to saturate the outbound
queues to enforce priority queuing and test that the switch is honoring
CoS (the ingress switch interrogates the DSCP and marks CoS based on the
DSCP value; we have confirmed with Wireshark that CoS is set properly).
We have 10 hosts sending traffic like this to one host to saturate the
downlink. High priority traffic is being dropped, causing service
disruptions during the test, which we are trying to eliminate.

#!/bin/sh
#set -x

if [ "$1" = "bond0" ]; then
    INTERFACES="enp7s0f0 enp7s0f1"

    for i in $INTERFACES; do
        # Clear what might be there
        tc qdisc del dev $i root
        # Add priority queue at the root of the interface
        tc qdisc add dev $i root handle 1: prio
        # Add sfq to each priority band to give each destination
        # a chance to get traffic
        tc qdisc add dev $i parent 1:1 handle 10: sfq
        tc qdisc add dev $i parent 1:2 handle 20: sfq
        tc qdisc add dev $i parent 1:3 handle 30: sfq
    done

    # Flush the POSTROUTING chain
    iptables -t mangle -F POSTROUTING
    # Don't mess with the loopback device
    iptables -t mangle -A POSTROUTING -o lo -j ACCEPT
    # Remark the Ceph heartbeat packets
    iptables -t mangle -A POSTROUTING -m dscp --dscp 0x30 -j DSCP --set-dscp 0x2e
    # Traffic destined for the monitors should get priority
    iptables -t mangle -A POSTROUTING -p tcp --dport 6789 -j DSCP --set-dscp 0x2e
    # All traffic going out the management interface is high priority
    iptables -t mangle -A POSTROUTING -o bond0.202 -j DSCP --set-dscp 0x2e
    # Send the high priority traffic to the tc 1:1 queue of the adapter
    iptables -t mangle -A POSTROUTING -m dscp --dscp 0x2e -j CLASSIFY --set-class 0001:0001
    # Stop processing high priority traffic so it doesn't get messed up
    iptables -t mangle -A POSTROUTING -m dscp --dscp 0x2e -j ACCEPT
    # Give the storage/replication traffic destined to another Ceph process
    # a higher priority than other traffic. Heartbeats were taken care of already
    iptables -t mangle -A POSTROUTING -p tcp --match multiport --dports 6800:6899 -j DSCP --set-dscp 0x20
    iptables -t mangle -A POSTROUTING -p tcp --match multiport --sports 6800:6899 -j DSCP --set-dscp 0x20
    # Send the replication traffic to the tc 1:2 queue of the adapter
    iptables -t mangle -A POSTROUTING -m dscp --dscp 0x20 -j CLASSIFY --set-class 0001:0002
    # Stop processing low priority traffic
    iptables -t mangle -A POSTROUTING -m dscp --dscp 0x20 -j ACCEPT
    # Whatever is left is best effort. We don't need to mark it because it will get
    # the default DSCP of 0. Just send it to the lowest tc class 1:3
    iptables -t mangle -A POSTROUTING -j CLASSIFY --set-class 0001:0003
fi

# tc -s qdisc show
qdisc prio 1: dev enp7s0f0 root refcnt 65 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 5165889658248 bytes 2479566296 pkt (dropped 1069, overlimits 0 requeues 756552)
 backlog 0b 0p requeues 756552
qdisc sfq 10: dev enp7s0f0 parent 1:1 limit 127p quantum 9014b depth 127 divisor 1024
 Sent 44182236606 bytes 393986229 pkt (dropped 200, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 20: dev enp7s0f0 parent 1:2 limit 127p quantum 9014b depth 127 divisor 1024
 Sent 4924468425634 bytes 2061625823 pkt (dropped 869, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 30: dev enp7s0f0 parent 1:3 limit 127p quantum 9014b depth 127 divisor 1024
 Sent 197238996008 bytes 23954244 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc prio 1: dev enp7s0f1 root refcnt 65 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 5534963706040 bytes 2865612577 pkt (dropped 800, overlimits 0 requeues 753797)
 backlog 0b 0p requeues 753797
qdisc sfq 10: dev enp7s0f1 parent 1:1 limit 127p quantum 9014b depth 127 divisor 1024
 Sent 45331019021 bytes 402412195 pkt (dropped 248, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 20: dev enp7s0f1 parent 1:2 limit 127p quantum 9014b depth 127 divisor 1024
 Sent 5489632685964 bytes 2463200368 pkt (dropped 552, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 30: dev enp7s0f1 parent 1:3 limit 127p quantum 9014b depth 127 divisor 1024
 Sent 1055 bytes 14 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

# iptables -t mangle -L POSTROUTING -vn
Chain POSTROUTING (policy ACCEPT 24M packets, 197G bytes)
 pkts bytes target     prot opt in     out        source       destination
  30M 3853M ACCEPT     all  --  *      lo         0.0.0.0/0    0.0.0.0/0
 792M   76G DSCP       all  --  *      *          0.0.0.0/0    0.0.0.0/0    DSCP match 0x30 DSCP set 0x2e
3567K 2368M DSCP       tcp  --  *      *          0.0.0.0/0    0.0.0.0/0    tcp dpt:6789 DSCP set 0x2e
 543K  201M DSCP       all  --  *      bond0.202  0.0.0.0/0    0.0.0.0/0    DSCP set 0x2e
 796M   78G CLASSIFY   all  --  *      *          0.0.0.0/0    0.0.0.0/0    DSCP match 0x2e CLASSIFY set 1:1
 796M   78G ACCEPT     all  --  *      *          0.0.0.0/0    0.0.0.0/0    DSCP match 0x2e
 323M 4044G DSCP       tcp  --  *      *          0.0.0.0/0    0.0.0.0/0    multiport dports 6800:6899 DSCP set 0x20
 455M 6112G DSCP       tcp  --  *      *          0.0.0.0/0    0.0.0.0/0    multiport sports 6800:6899 DSCP set 0x20
 778M   10T CLASSIFY   all  --  *      *          0.0.0.0/0    0.0.0.0/0    DSCP match 0x20 CLASSIFY set 1:2
 778M   10T ACCEPT     all  --  *      *          0.0.0.0/0    0.0.0.0/0    DSCP match 0x20
  24M  197G CLASSIFY   all  --  *      *          0.0.0.0/0    0.0.0.0/0    CLASSIFY set 1:3

Thank you for helping.
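
To make the per-band drop counts in the tc output easier to compare, here is a rough sketch of a filter that boils `tc -s qdisc show` output down to one line per qdisc. `tc_drop_summary` is a made-up helper name and is not part of the script above. It assumes the `Sent ... pkt (dropped N, ...` layout shown in this mail, which may differ in other iproute2 versions.

```shell
#!/bin/sh
# Hypothetical helper: read `tc -s qdisc show` output on stdin and print,
# per qdisc, the device, kind, handle, packets sent, packets dropped and
# drops per million packets sent.
tc_drop_summary() {
    awk '
        # A "qdisc" line starts a new entry: remember kind, handle and device.
        $1 == "qdisc" {
            kind = $2
            handle = $3
            dev = "?"
            for (i = 4; i < NF; i++)
                if ($i == "dev") dev = $(i + 1)
        }
        # The "Sent" line carries the packet and drop counters.
        $1 == "Sent" {
            pkts = $4
            drops = $7
            sub(/,$/, "", drops)    # strip the trailing comma after the count
            ppm = (pkts + 0 > 0) ? drops * 1000000 / pkts : 0
            printf "%s %s %s sent=%s dropped=%s ppm=%.2f\n", dev, kind, handle, pkts, drops, ppm
        }
    '
}
```

It would be used as `tc -s qdisc show dev enp7s0f0 | tc_drop_summary`. The counters are printed as the strings tc reports, so large values are not reformatted.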
- ----------------
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v0.13.1
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJVZef5CRDmVDuy+mK58QAASHwQAL6IRdXh5q+QBhFqrmwk
yRlsgnLm0lGPA34nRtVi6Av0XV0htDxUoF0l9oTks0+D1TH+uKqq4yb19R4+
XHxODo/Flz9zb9j75KSmv7lJbscYvWeIAtnbvxvk2slBKx+eOPLIeivoMoKb
V/YBvzyFVWcfgniF1hZKhpEgVGIxTKLvb43PRPJsJ/IsSJLeTEEIsJtBcSKj
Z2a0QqwQ2/Tn+yuLJ50aK7Ze3pZo8Qaq5GNk/GwUa8mxWL6ctHNROGXLynMy
g2AeWLb5aHsUPwwBo5i5SSxXdXrrNne+msA9jzcqOFIQq/10ZSVBTRP6wkUa
QtygPpPQLclKCs2BLve4CbX5TLlRGP2y1rSQchzPeVg1JkA5r3ETVYseDB9w
ylQ7H4i1+ZoNIGHKtBSilU/5vQxEwHM+Ol5sLjp2qk0htUK/CmU0MrDBuUsm
/kaXkfZTjklObAyhjOv2wW08QTdJLaC0eJ2tP2o8deJvQA66JcWVcN0Cxjxa
3KkLf2whQT3ZEoKRlPXz+T5hKqikCjkkGzK+dY1xBfsTPkl5aleFHi3YWD8K
rbPhFVU+AANvThF9TD+oEpEzlC/Fwhjt4huUqJL+HPPAJJgwVbnfTuCbf5nC
9S2CLfT6Qi3odSXbaRNi2ecVnrxVqUwiqrdZhyer+3CIoE5I9RYXVFM3gHeX
MxHH
=aDRb
-----END PGP SIGNATURE-----