Re: [PATCH v2] netfilter: nf_conntrack: resolve clash for matching conntracks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jul 07, 2018 at 12:44:10PM +0200, Martynas Pumputis wrote:
> On 4 July 2018 at 19:14, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > On Mon, Jul 02, 2018 at 04:52:14PM +0200, Martynas Pumputis wrote:
> >> This patch enables the clash resolution for NAT (disabled in
> >> "590b52e10d41") if clashing conntracks match (i.e. both tuples are equal)
> >> and a protocol allows it.
> >>
> >> The clash might happen for a connections-less protocol (e.g. UDP) when
> >> two threads in parallel writes to the same socket and consequent calls
> >> to "get_unique_tuple" return the same tuples (incl. reply tuples).
> >>
> >> In this case it is safe to perform the resolution, as the losing CT
> >> describes the same mangling as the winning CT, so no modifications to
> >> the packet are needed, and the result of rules traversal for the loser's
> >> packet stays valid.
> >>
> >> Signed-off-by: Martynas Pumputis <martynas@weave.works>
> >> ---
> >>  net/netfilter/nf_conntrack_core.c | 31 +++++++++++++++++++++++--------
> >>  1 file changed, 23 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> >> index 3465da2a98bd..cc5ef8c9d0da 100644
> >> --- a/net/netfilter/nf_conntrack_core.c
> >> +++ b/net/netfilter/nf_conntrack_core.c
> >> @@ -502,6 +502,18 @@ nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
> >>              net_eq(net, nf_ct_net(ct));
> >>  }
> >>
> >> +static inline bool
> >> +nf_ct_match(const struct nf_conn *ct1, const struct nf_conn *ct2)
> >> +{
> >> +     return nf_ct_tuple_equal(&ct1->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
> >> +                              &ct2->tuplehash[IP_CT_DIR_ORIGINAL].tuple) &&
> >> +            nf_ct_tuple_equal(&ct1->tuplehash[IP_CT_DIR_REPLY].tuple,
> >> +                              &ct2->tuplehash[IP_CT_DIR_REPLY].tuple) &&
> >> +            nf_ct_zone_equal(ct1, nf_ct_zone(ct2), IP_CT_DIR_ORIGINAL) &&
> >> +            nf_ct_zone_equal(ct1, nf_ct_zone(ct2), IP_CT_DIR_REPLY) &&
> >> +            net_eq(nf_ct_net(ct1), nf_ct_net(ct2));
> >> +}
> >> +
> >>  /* caller must hold rcu readlock and none of the nf_conntrack_locks */
> >>  static void nf_ct_gc_expired(struct nf_conn *ct)
> >>  {
> >> @@ -696,18 +708,21 @@ static int nf_ct_resolve_clash(struct net *net, struct sk_buff *skb,
> >>       struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
> >>       const struct nf_conntrack_l4proto *l4proto;
> >>
> >> +     enum ip_conntrack_info oldinfo;
> >> +     struct nf_conn *loser_ct = nf_ct_get(skb, &oldinfo);
> >> +
> >>       l4proto = __nf_ct_l4proto_find(nf_ct_l3num(ct), nf_ct_protonum(ct));
> >>       if (l4proto->allow_clash &&
> >> -         ((ct->status & IPS_NAT_DONE_MASK) == 0) &&
> >>           !nf_ct_is_dying(ct) &&
> >>           atomic_inc_not_zero(&ct->ct_general.use)) {
> >> -             enum ip_conntrack_info oldinfo;
> >> -             struct nf_conn *loser_ct = nf_ct_get(skb, &oldinfo);
> >> -
> >> -             nf_ct_acct_merge(ct, ctinfo, loser_ct);
> >> -             nf_conntrack_put(&loser_ct->ct_general);
> >> -             nf_ct_set(skb, ct, oldinfo);
> >> -             return NF_ACCEPT;
> >> +             if (((ct->status & IPS_NAT_DONE_MASK) == 0) ||
> >> +                 nf_ct_match(ct, loser_ct)) {
> >> +                     nf_ct_acct_merge(ct, ctinfo, loser_ct);
> >> +                     nf_conntrack_put(&loser_ct->ct_general);
> >> +                     nf_ct_set(skb, ct, oldinfo);
> >> +                     return NF_ACCEPT;
> >
> > This will not work because the packet losing race will get mangled by
> > when NAT is performed.
> 
> The idea of this patch is to resolve the conflict only among packets
> with the same mangling (= with matching tuples). The mangling happens
> before the confirmation. Please correct me if I'm missing something.

Yes, the mangling happens before confirmation, but will mangle the
source port to make sure two makes don't get the same tuple. Hence,
the two packets will get different source port numbers, that's why we
need the code to undo the NAT mangling in
368982cd7d1bd41cd39049c794990aca3770db44.

Have a look at tcpdump to make sure packets are mangled consistently,
not just going through / not dropped.

> > Did you test this? Or this is some sort of RFC?
> 
> I've tested it and put the steps for reproducing the issue and logs
> for different configurations at https://github.com/brb/conntrack-race.

OK, are you watching for packets drops to make sure your fix work, I
mean, how are you doing the validation?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Netfitler Users]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux