Re: Conntrack insertion race conditions -- any workarounds?

On Thu, Sep 06, 2018 at 09:01:09AM -0400, Kyle Larose wrote:
> Hello,
> 
> I have been using nfqueue in conjunction with conntrack to
> monitor/police flows on containers in my kubernetes cluster. This
> worked until I started pushing UDP traffic through my nfqueue service.
> At that point, I began to experience issues with DNS queries -- they
> would take forever!
> 
> In particular, I noticed that two queries would come out almost in
> parallel: an A and a AAAA query. The AAAA would almost always get
> dropped after going through nfqueue. After proving to myself that my
> service wasn't at fault, I started digging, and came across a few
> posts discussing the issue. For example, see
> https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/.
> 
> Basically, I'm running into an issue within conntrack whereby two
> packets with the same connection tuple race to enter the table. The
> loser is dropped. I confirmed that I was hitting this condition by
> checking the conntrack stats, which show "insert_failed" and "drop"
> increasing every time the condition occurs. The counters do not
> increase otherwise.

This sounds like the issue discussed e.g. here:

  https://www.mail-archive.com/netdev@xxxxxxxxxxxxxxx/msg232633.html

which should be addressed by recent commit 368982cd7d1b ("netfilter:
nfnetlink_queue: resolve clash for unconfirmed conntracks") in 4.18-rc1.

> Right now I am running Ubuntu 18.04, with its stock kernel:
> 4.15.0-32-generic.

Could you check whether a 4.18.x or 4.19-rc2 kernel behaves differently?
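
For a before/after comparison between kernels, something like the
following might help. It is only a sketch: it assumes the per-CPU
conntrack counters are exposed in /proc/net/stat/nf_conntrack as a
header line followed by one row of hexadecimal values per CPU, and the
function names are made up for illustration.

```python
from pathlib import Path

# Assumed location of the per-CPU conntrack statistics.
STAT_PATH = Path("/proc/net/stat/nf_conntrack")


def parse_conntrack_stats(text):
    """Sum the per-CPU hexadecimal counters, keyed by column name."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {}
    header, rows = lines[0], lines[1:]
    totals = dict.fromkeys(header, 0)
    for row in rows:
        for name, value in zip(header, row):
            totals[name] += int(value, 16)
    return totals


def race_counters(text):
    """Return only the two counters mentioned in this thread."""
    totals = parse_conntrack_stats(text)
    return {name: totals.get(name, 0) for name in ("insert_failed", "drop")}


if __name__ == "__main__":
    if STAT_PATH.exists():
        print(race_counters(STAT_PATH.read_text()))
    else:
        print(f"{STAT_PATH} not found (is nf_conntrack loaded?)")
```

Running it once before and once after a burst of parallel A/AAAA
queries, on each kernel, should show whether the two counters still
grow.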

> I understand that a few fixes for this issue are in progress, or have
> been merged into the kernel. However, I do not have control over the
> kernel I will be running, so there is a good chance that any fixes
> will not be in place.

If you could confirm that the commit above (with its prerequisites)
resolves the issue, it may be possible to backport it to the
distribution kernel. I tried to backport it to our 4.4 based kernel and
it wasn't very hard; backporting to a 4.15 based one should be easier.

Michal Kubecek


