Re: [net-next PATCH 0/5] netfilter: conntrack: optimization, remove central spinlock

Hi Pablo,

This should obviously have been for nf-next, and I also forgot to Cc
netfilter-devel@xxxxxxxxxxxxxxx. Do you want me to repost?

--Jesper


On Thu, 27 Feb 2014 17:41:10 +0100 Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:

> This patchset changes the conntrack locking and provides a huge
> performance improvement.
> 
> This patchset is based upon Eric Dumazet's proposed patch:
>   http://thread.gmane.org/gmane.linux.network/268758/focus=47306
> In agreement with Eric Dumazet, I have taken over this patch (and
> turned it into an entire patchset).
> 
> The primary focus is to remove the central spinlock nf_conntrack_lock.
> This requires several steps to be achieved.
> 
> Patch01: Trivial cleanups
> 
> Patch02: Moves the "special" dying/unconfirmed/template lists to use a
>  per cpu spinlock.
> 
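
To make the per-CPU list idea in Patch02 concrete, here is a minimal
userspace sketch using pthread mutexes in place of kernel spinlocks. It
is not code from the patch: NCPUS, struct entry, struct pcpu_list,
pcpu_init(), pcpu_add() and pcpu_del() are hypothetical names, and
storing the owning CPU in each entry is an assumption of the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NCPUS 4 /* hypothetical CPU count, for illustration only */

struct entry {
	struct entry *next;
	int cpu; /* which per-CPU list currently holds this entry */
};

struct pcpu_list {
	pthread_mutex_t lock; /* protects head and the chain behind it */
	struct entry *head;
};

static struct pcpu_list lists[NCPUS];

/* Initialise every per-CPU list; call once before use. */
static void pcpu_init(void)
{
	for (int i = 0; i < NCPUS; i++) {
		int rc = pthread_mutex_init(&lists[i].lock, NULL);
		assert(rc == 0);
		lists[i].head = NULL;
	}
}

/* Add an entry to the list of the given CPU, taking only that CPU's lock. */
static void pcpu_add(struct entry *e, int cpu)
{
	struct pcpu_list *l = &lists[cpu];

	e->cpu = cpu;
	pthread_mutex_lock(&l->lock);
	e->next = l->head;
	l->head = e;
	pthread_mutex_unlock(&l->lock);
}

/* Remove an entry; the stored cpu tells us which lock to take,
 * so removal works even when called from a different CPU. */
static void pcpu_del(struct entry *e)
{
	struct pcpu_list *l = &lists[e->cpu];

	pthread_mutex_lock(&l->lock);
	for (struct entry **pp = &l->head; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			break;
		}
	}
	pthread_mutex_unlock(&l->lock);
}
```

Adds on different CPUs never touch the same lock in this sketch; only a
removal issued from another CPU has to take a remote list's lock.
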
> Patch03: Prepares for patch04 by addressing a race condition.
>  This is a separate patch for the reviewers' sake.
> 
> Patch04: Separates expect locking from nf_conntrack_lock. The expect
>  list is small (default max 256), thus it just gets a single lock.
> 
> Patch05: Finally removes nf_conntrack_lock, and instead uses an
>  array of hashed spinlocks to protect insertions/deletions of
>  conntracks into the hash table, while still allowing dynamic
>  resizing of the hash table.
> 
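
The hashed-lock scheme in Patch05 can be sketched in userspace roughly
as follows. This is an illustration under stated assumptions, not the
kernel code: pthread mutexes stand in for spinlocks, and NLOCKS,
bucket_locks, locks_init(), lock_index(), double_lock() and
double_unlock() are hypothetical names. Hash table resizing is not
covered. The point is that an operation touching two hash buckets must
take both locks in a fixed order, so that two CPUs working on the same
pair cannot deadlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NLOCKS 64 /* hypothetical lock count, not taken from the patch */

static pthread_mutex_t bucket_locks[NLOCKS];

/* Initialise the lock array; call once before use. */
static void locks_init(void)
{
	for (int i = 0; i < NLOCKS; i++) {
		int rc = pthread_mutex_init(&bucket_locks[i], NULL);
		assert(rc == 0);
	}
}

/* Map a bucket hash to the index of the lock that guards it. */
static unsigned int lock_index(uint32_t hash)
{
	return hash % NLOCKS;
}

/* Lock the buckets for two hashes, always in ascending lock-index
 * order; if both map to the same lock, take it only once. */
static void double_lock(uint32_t h1, uint32_t h2)
{
	unsigned int a = lock_index(h1), b = lock_index(h2);

	if (a > b) {
		unsigned int t = a;
		a = b;
		b = t;
	}
	pthread_mutex_lock(&bucket_locks[a]);
	if (a != b)
		pthread_mutex_lock(&bucket_locks[b]);
}

/* Release the locks taken by double_lock(), in reverse order. */
static void double_unlock(uint32_t h1, uint32_t h2)
{
	unsigned int a = lock_index(h1), b = lock_index(h2);

	if (a > b) {
		unsigned int t = a;
		a = b;
		b = t;
	}
	if (a != b)
		pthread_mutex_unlock(&bucket_locks[b]);
	pthread_mutex_unlock(&bucket_locks[a]);
}
```

A caller would wrap an insertion or deletion that touches both
directions of a connection in double_lock(h1, h2) and
double_unlock(h1, h2), where h1 and h2 are the two bucket hashes.
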
> 
> Testing
> -------
> For expectations I've mostly tested the FTP helper module
> (nf_conntrack_ftp), using the commands:
> 
>  for x in `seq 1 300`; do \
>    echo $x; \
>    echo -e "USER anonymous\nPASS nothing\nPASV" | nc 192.168.42.129 21; \
>  done
> 
>  wget ftp://192.168.42.129/pub/delete.me.4k -O /dev/null
> 
> For overload/DoS testing, I've primarily done SYN-flood attack testing.
> Results on a 24-core E5-2695v2(ES) with 10Gbit/s ixgbe (with tool trafgen)
> 
>  Base kernel : New   810.405 conntrack/sec
>  Fixed kernel: New 2.233.876 conntrack/sec
> 
> Note that other flood attacks (SYN+ACK or ACK) can easily be deflected using:
>  # iptables -A INPUT -m state --state INVALID -j DROP
>  # sysctl -w net/netfilter/nf_conntrack_tcp_loose=0
> 
> E.g. this machine can deflect 6.481.463 "invalid" conntrack/sec (from
> an ACK-flood).
> 
> Perf data:
> ----------
> The nf_conntrack_lock suffers from huge contention on current
> generation servers (8 or more cores/threads).  The data below was
> collected under SYN-flooding (without a listen socket).
> 
> The lock contention is very "visible" in perf on a base kernel:
> 
>     -  72.56%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
>        - _raw_spin_lock_bh
>           + 25.33% init_conntrack
>           + 24.86% nf_ct_delete_from_lists
>           + 24.62% __nf_conntrack_confirm
>           + 24.38% destroy_conntrack
>           + 0.70% tcp_packet
>     +   2.21%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
>     +   1.15%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
>     +   0.77%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
>     +   0.70%  ksoftirqd/6  [nf_conntrack]       [k] nf_ct_delete
>     +   0.55%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table
> 
> Perf after the patchset (SYN-flood attack):
> 
> +   9.62%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
> +   3.78%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
> +   2.71%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
> +   2.55%  ksoftirqd/6  [kernel.kallsyms]    [k] check_leaf
> +   2.38%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table
> +   2.06%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_alloc
> +   1.94%  ksoftirqd/6  [nf_conntrack]       [k] __nf_conntrack_alloc
> -   1.94%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock
>    - _raw_spin_lock
>       + 90.32% nf_conntrack_double_lock
>       + 3.61% get_partial_node
>       + 1.81% nf_ct_delete_from_lists
>       + 1.68% __nf_conntrack_confirm
>       + 1.03% sch_direct_xmit
>       + 0.52% scheduler_tick
> +   1.86%  ksoftirqd/6  [kernel.kallsyms]    [k] nf_iterate
> +   1.80%  ksoftirqd/6  [nf_conntrack]       [k] init_conntrack
> +   1.77%  ksoftirqd/6  [kernel.kallsyms]    [k] __neigh_event_send
> -   1.70%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
>    - _raw_spin_lock_bh
>       + 32.55% nf_ct_del_from_dying_or_unconfirmed_list
>       + 25.33% init_conntrack
>       + 19.88% tcp_packet
>       + 17.97% nf_ct_delete_from_lists
>       + 1.62% nf_conntrack_in
>       + 1.33% ixgbe_poll
>       + 0.74% destroy_conntrack
> +   1.64%  ksoftirqd/6  [nf_conntrack]       [k] hash_conntrack_raw
> +   1.58%  ksoftirqd/6  [kernel.kallsyms]    [k] __netif_receive_skb_core
> +   1.51%  ksoftirqd/6  [nf_conntrack]       [k] __nf_conntrack_find_get
> +   1.48%  ksoftirqd/6  [kernel.kallsyms]    [k] __cmpxchg_double_slab
> +   1.46%  ksoftirqd/6  [nf_conntrack]       [k] nf_conntrack_in
> +   1.45%  ksoftirqd/6  [kernel.kallsyms]    [k] __local_bh_enable_ip
> 
> 
> ---
> 
> Jesper Dangaard Brouer (5):
>       netfilter: conntrack: remove central spinlock nf_conntrack_lock
>       netfilter: conntrack: seperate expect locking from nf_conntrack_lock
>       netfilter: avoid race with exp->master ct
>       netfilter: conntrack: spinlock per cpu to protect special lists.
>       netfilter: trivial code cleanup and doc changes
> 
> 
>  include/net/netfilter/nf_conntrack.h      |   11 +
>  include/net/netfilter/nf_conntrack_core.h |    9 +
>  include/net/netns/conntrack.h             |   13 +
>  net/netfilter/nf_conntrack_core.c         |  427 ++++++++++++++++++++---------
>  net/netfilter/nf_conntrack_expect.c       |   36 ++
>  net/netfilter/nf_conntrack_h323_main.c    |    4 
>  net/netfilter/nf_conntrack_helper.c       |   37 ++-
>  net/netfilter/nf_conntrack_netlink.c      |  128 +++++----
>  net/netfilter/nf_conntrack_sip.c          |    8 -
>  9 files changed, 456 insertions(+), 217 deletions(-)
> 



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer