Re: clogging qdisc

Linux Advanced Routing and Traffic Control


 



On 27.12.2018 19:23, Grant Taylor wrote:
On 12/27/18 10:15 AM, Grzegorz Gwóźdź wrote:
This solution worked for a few years in several networks, but in one of them the mechanism has been clogging at peak hours for the last few weeks.

Okay.  It sounds to me like the methodology works well enough. But it might have scaling problems.


The previous system was 2 x 6-core Xeons (24 threads).
Now it's a Threadripper 2990 (64 threads).

No core is loaded over 40% (at most 25% software interrupts per core, since the NIC has 8 queues per interface, so each interface can be serviced by 8 cores).
A few ksoftirqd threads sit at about 0.7% CPU.

How can I find the choke point if no metric points to it?
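(As a rough sketch of how one might hunt for an invisible choke point: watch where softirq time goes per core and what the kernel is spending cycles on. mpstat comes from sysstat, and eth0 is just an example name, not anything this thread prescribes.)

# per-CPU utilization, including %soft (softirq time)
mpstat -P ALL 1

# per-CPU softirq counters; watch whether NET_RX grows unevenly
watch -d cat /proc/softirqs

# sample where kernel time goes (look for htb/sfq/tc symbols near the top)
perf top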


Pings to all local hosts grow to hundreds of ms (even to hosts without any traffic) and throughput drops.

Ouch.

The only solution is:
tc qdisc del root dev eth0

That doesn't seem like a solution.  Maybe a workaround, if you're lucky.

If I immediately add the rules again, the problem immediately starts again too.

That sounds like the workaround doesn't even work.

But after some time, even though traffic is higher, I can load the queues again and everything works until the next attack.

I'm thinking that "attack" might be the proper word.

I'm wondering if this is a number-of-packets-per-second issue rather than a bytes-per-second issue.

Specifically, whether the "attack" consists of considerably more, smaller packets than normal. I'm guessing normal traffic is fewer but bigger packets.

Take a packet capture during normal traffic periods and a separate packet capture during attack traffic periods.

Then open each of the captures in Wireshark and pull up the Packet Lengths report from the Statistics menu.  I'm guessing that you will see a significant difference between the two captures.
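(A minimal sketch of that workflow from the command line, assuming tcpdump and tshark are available; the interface and file names are examples only:)

# capture headers only, so the capture box can keep up at high rates
tcpdump -i eth0 -s 128 -w normal.pcap
tcpdump -i eth0 -s 128 -w attack.pcap

# the same "Packet Lengths" report, without the GUI
tshark -r normal.pcap -q -z plen,tree
tshark -r attack.pcap -q -z plen,tree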


I put a separate machine on a mirrored interface just for logging traffic. Statistically there is no difference, either from the LAN or from the Internet. After the clogging starts, the number of small packets rises (mostly SYN packets) and the overall number of packets drops, but just before the clogging I cannot see anything strange. Bear in mind it's over 1 Gbps and 100k pps.
I'm almost sure the cause is in the incoming packets, but it's hard to find.
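(If bare SYNs are the suspect, a capture filter can isolate just those; a sketch, with eth0 and the packet count as examples:)

# grab only SYN segments carrying no ACK, i.e. new connection attempts
tcpdump -i eth0 -n -c 100000 'tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn' -w syn.pcap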

10-second statistics under "attack", roughly the same as normal:
==================================================================================================================================
Packet Lengths:
Topic / Item       Count         Average       Min val       Max val       Rate (ms)     Percent       Burst rate    Burst start
----------------------------------------------------------------------------------------------------------------------------------
Packet Lengths     3416240       866,96        60            1518          341,6367      100%          443,1200      3,735
 0-19              0             -             -             -             0,0000        0,00%         -             -
 20-39             0             -             -             -             0,0000        0,00%         -             -
 40-79             987398        65,82         60            79            98,7435       28,90%        129,1300      2,865
 80-159            323407        103,61        80            159           32,3419       9,47%         48,5500       1,905
 160-319           98202         218,05        160           319           9,8206        2,87%         12,1800       3,360
 320-639           72602         467,25        320           639           7,2605        2,13%         16,1800       4,680
 640-1279          57417         969,00        640           1279          5,7419        1,68%         7,6000        1,510
 1280-2559         1877214       1466,15       1280          1518          187,7284      54,95%        257,1100      3,705
 2560-5119         0             -             -             -             0,0000        0,00%         -             -
 5120 and greater  0             -             -             -             0,0000        0,00%         -             -
----------------------------------------------------------------------------------------------------------------------------------

The statistics above show ~300k pps because they cover the whole interface, IN + OUT, for both containers.

The default queue for unclassified packets is 2 Gbps, so everything that doesn't match a defined filter lands there.

Smaller or bigger packets: every packet should fall into its designated queue if it's a normal packet, but maybe it's a maliciously crafted frame that breaks the rules and algorithms.
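(The exact config isn't shown in this thread; assuming an HTB root with per-class sfq, which would match the dynamically created sfq queues mentioned further down, such a default class might look roughly like this, with all handles and rates purely illustrative:)

# unclassified traffic falls through to class 1:999
tc qdisc add dev eth0 root handle 1: htb default 999
tc class add dev eth0 parent 1: classid 1:999 htb rate 2gbit
tc qdisc add dev eth0 parent 1:999 handle 999: sfq perturb 10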


I don't think it is a hardware issue, because this system runs in an LXC container, and everything works fine in another container on the same NIC (doing the same work for other clients).

I think containers and VMs are good for some things.  I don't think that they (more specifically, their overhead) are good for high-throughput traffic, particularly large numbers of small packets per second (PPS).  High PPS with small packets requires quite a bit of optimization.  I also think it's actually rare outside of specific situations.  What I've seen more frequently is fewer (by one or more orders of magnitude) packets that are larger (by one or more orders of magnitude).  Overall the amount of data is roughly the same, but /how/ it's delivered can cause considerable load on equipment.  Especially equipment that is not optimized for high PPS, much less carrying additional overhead like containers or VMs.

It's LXC, so there is almost no overhead on network traffic.
I estimate this machine could handle over 40 Gbps.

There are 2 LXC containers on this machine. One is handling 1.7 Gbps and working fine; the second carries 1.2 Gbps and has problems.
Both work on the same physical interface.
I've moved one container to another machine: no difference. The one that was clogging still clogged.


Load on the system is low and there is no hardware problem: the hardware has been replaced entirely, and on the new hardware I installed a fresh system (Ubuntu 18.04).
No dropped packets in the interface statistics. dmesg is clear.
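(For completeness, the counters worth checking beyond the plain interface statistics; eth0 as an example:)

# NIC/driver-level counters, including per-queue drops and misses
ethtool -S eth0 | grep -iE 'drop|miss|err|fifo'

# detailed kernel-side stats for the interface
ip -s -s link show dev eth0

# qdisc-level drops, overlimits and backlog
tc -s qdisc show dev eth0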

What messages were you seeing in dmesg before?

I mean there is nothing new in dmesg



As a result, the conntrack table grows until it overflows (if I don't delete the qdisc). I even sniffed all the traffic and tried to analyze it, but that's hard since it's over 1 Gbps (on a 10 Gb interface).

The connection tracking table overflowing tells me one of two things: that you are truly dealing with a high-PPS condition, or that you don't have enough memory in the system and the size of the conntrack table is restricted.

I once took a system that comfortably ran with ~512 MB of memory up to 4 GB to allow the conntrack table to be large enough for what the system was doing.  (I think the conntrack table was a fixed percentage of memory in that kernel.  Maybe it's tunable now.)
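(For the record, it is tunable via sysctl on current kernels; the value below is only an example, and each entry costs some memory:)

# current usage vs. the ceiling
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

# raise the ceiling
sysctl -w net.netfilter.nf_conntrack_max=1048576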

The conntrack table overflows because clients keep trying to establish connections but never get a reply (the clogged TC blocks them). It fills up over about 10 minutes and then overflows.


What can I check?

Check the Packet Lengths graph as suggested above.

If your problem is indeed high PPS, you might also be having problems outside of your Linux machine.  It's quite possible that there is enough traffic / PPS that things like switches and / or wireless access points are also being negatively affected.  It's possible that they are your choke point and not the actual Linux system.

But deleting the qdisc causes traffic to grow, since clients then have no limits. The choke point is certainly in my Linux box.

And if it were a NIC issue, it would get worse after deleting the qdisc (more traffic).



Where should I look for the cause?

I think you need to get a good understanding of what your two typical traffic patterns are: normal and attack.  That includes determining whether this is legitimate traffic or someone conducting an attack, with the network buckling under the stress.

You might also consider changing out network cards.  I've been around people who like to poo-poo some Realtek cards and other non-Intel / non-Broadcom NICs.  Admittedly, some of the better NICs have more CPU / memory / I/O on them to handle more traffic.

I use a 10 Gb Mellanox NIC. I can try an Intel one, but Mellanox has always performed better for me.


Are there any "hacks" in TC that allow looking into its guts?

It looks like it's changing state to "clogged" but

tc -s class ls dev eth0

looks completely normal (only the number of sfq queues grows, since they are created dynamically for every connection and more and more connections are opened but not closed)
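(One low-tech way to catch the moment of the state change, again with eth0 as an example: diff the counters in real time and see whether drops, overlimits or backlog jump when the clogging starts:)

watch -d -n 1 'tc -s qdisc show dev eth0'
watch -d -n 1 'tc -s class show dev eth0'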

If I don't delete the qdisc, it recovers from that state after about 15 minutes (but in the meantime traffic is crushed to 1/3, conntrack overflows, etc.).

At the same time the system itself responds pretty well, there is no sign of anything going wrong, and the similar system in the other container works OK...


GG



