Conntrack insertion race conditions -- any workarounds?

Hello,

I have been using nfqueue in conjunction with conntrack to
monitor/police flows on containers in my Kubernetes cluster. This
worked until I started pushing UDP traffic through my nfqueue service.
At that point, I began to experience issues with DNS queries -- they
would take far longer than normal to resolve.

In particular, I noticed that two queries would come out almost in
parallel: an A and a AAAA query. The AAAA would almost always get
dropped after going through nfqueue. After proving to myself that my
service wasn't at fault, I started digging, and came across a few
posts discussing the issue. For example, see
https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/.

Basically, I'm running into an issue within conntrack whereby two
packets with the same connection tuple race to enter the table. The
loser is dropped. I confirmed that I was hitting this condition by
checking the conntrack stats, which show "insert_failed" and "drop"
increasing every time the condition occurs. The counters do not
increase otherwise.
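
For anyone who wants to watch the same counters, here is a minimal
sketch of the check in Python. It assumes the kernel exposes per-CPU
conntrack statistics in /proc/net/stat/nf_conntrack as a header row of
column names followed by one row of hexadecimal values per CPU. The
function names are my own, not part of any tool.

```python
def sum_conntrack_counters(stat_text, names=("insert_failed", "drop")):
    """Sum the named per-CPU counters from nf_conntrack stat text.

    stat_text is expected to hold a header line of column names followed
    by one line of hexadecimal values per CPU (assumed file layout).
    """
    lines = [ln.split() for ln in stat_text.splitlines() if ln.strip()]
    if not lines:
        return {name: 0 for name in names}
    header, rows = lines[0], lines[1:]
    totals = {}
    for name in names:
        idx = header.index(name)  # raises ValueError if the column is absent
        totals[name] = sum(int(row[idx], 16) for row in rows)
    return totals


def read_conntrack_counters(path="/proc/net/stat/nf_conntrack"):
    """Read the stat file (path is an assumption) and return the totals."""
    with open(path) as fh:
        return sum_conntrack_counters(fh.read())
```

Calling read_conntrack_counters() before and after a batch of DNS
lookups and comparing the two results should show whether
"insert_failed" and "drop" moved during that window.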

Right now I am running Ubuntu 18.04, with its stock kernel: 4.15.0-32-generic.

I understand that a few fixes for this issue are in progress, or have
been merged into the kernel. However, I do not have control over the
kernel I will be running, so there is a good chance that any fixes
will not be in place.

I'm wondering if anyone has suggestions for workarounds I could put in
place? The most promising one I saw involved using tc to place a delay
on AAAA packets. However, I could not get that to work -- my service
runs on traffic *leaving* the container, meaning that a rule on egress
from the interface is too late. I could not figure out how to force
traffic entering from a local process to hit tc prior to going through
conntrack. I'm also concerned that other UDP services may hit the same
issue if they exhibit similar traffic patterns.

Some thoughts I have right now are:
1. Add a "delay" queue where my service delays the AAAA packets prior
to punting them to the main queue.
2. Don't use conntrack at all (which will really hurt performance --
my service would then have to see every packet, which it does not
otherwise need to do)
3. Use something other than nfqueue (does anyone have suggestions for
alternatives which would allow me to see the L3 contents of packets
inline, and possibly decide on them?)
4. ???
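
To make option 1 concrete, here is a rough sketch of the classification
step such a "delay" queue would need: deciding from the UDP payload
whether a packet is a DNS AAAA query. It only parses the standard DNS
header and first question. The function names and the 5 ms hold time
are placeholders I made up for illustration, not tested values, and
the code that actually receives packets from nfqueue and re-issues
verdicts is left out.

```python
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record


def dns_question_type(payload):
    """Return the QTYPE of the first question in a DNS query, else None.

    payload is the UDP payload (DNS message) as bytes. Responses,
    truncated messages and malformed names all yield None.
    """
    if len(payload) < 12:  # fixed DNS header is 12 bytes
        return None
    flags, qdcount = struct.unpack("!HH", payload[2:6])
    if flags & 0x8000 or qdcount < 1:  # QR bit set means a response
        return None
    pos = 12
    while pos < len(payload):
        length = payload[pos]
        if length == 0:  # root label terminates the name
            pos += 1
            break
        if length & 0xC0:  # compression pointer, not expected in a question
            return None
        pos += 1 + length
    else:
        return None  # ran off the end without finding the terminator
    if pos + 4 > len(payload):
        return None
    qtype, _qclass = struct.unpack("!HH", payload[pos:pos + 4])
    return qtype


def hold_time_seconds(payload, aaaa_delay=0.005):
    """How long to hold a packet before passing it on (hypothetical policy)."""
    return aaaa_delay if dns_question_type(payload) == QTYPE_AAAA else 0.0
```

The idea would be that the A query is released immediately and creates
the conntrack entry, while the AAAA query is held briefly so that it
arrives after the entry is confirmed rather than racing it.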

Any help is greatly appreciated.

Thanks!

Kyle


