Re: [PATCH v2] netfilter: nf_conntrack: resolve clash for matching conntracks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9 July 2018 at 20:12, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> On Sat, Jul 07, 2018 at 12:44:10PM +0200, Martynas Pumputis wrote:
>> On 4 July 2018 at 19:14, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
>> > On Mon, Jul 02, 2018 at 04:52:14PM +0200, Martynas Pumputis wrote:
>> >> This patch enables the clash resolution for NAT (disabled in
>> >> "590b52e10d41") if clashing conntracks match (i.e. both tuples are equal)
>> >> and a protocol allows it.
>> >>
>> >> The clash might happen for a connections-less protocol (e.g. UDP) when
>> >> two threads in parallel writes to the same socket and consequent calls
>> >> to "get_unique_tuple" return the same tuples (incl. reply tuples).
>> >>
>> >> In this case it is safe to perform the resolution, as the losing CT
>> >> describes the same mangling as the winning CT, so no modifications to
>> >> the packet are needed, and the result of rules traversal for the loser's
>> >> packet stays valid.
>> >>
>> >> Signed-off-by: Martynas Pumputis <martynas@weave.works>
>> >> ---
>> >>  net/netfilter/nf_conntrack_core.c | 31 +++++++++++++++++++++++--------
>> >>  1 file changed, 23 insertions(+), 8 deletions(-)
>> >>
>> >> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
>> >> index 3465da2a98bd..cc5ef8c9d0da 100644
>> >> --- a/net/netfilter/nf_conntrack_core.c
>> >> +++ b/net/netfilter/nf_conntrack_core.c
>> >> @@ -502,6 +502,18 @@ nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
>> >>              net_eq(net, nf_ct_net(ct));
>> >>  }
>> >>
>> >> +static inline bool
>> >> +nf_ct_match(const struct nf_conn *ct1, const struct nf_conn *ct2)
>> >> +{
>> >> +     return nf_ct_tuple_equal(&ct1->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
>> >> +                              &ct2->tuplehash[IP_CT_DIR_ORIGINAL].tuple) &&
>> >> +            nf_ct_tuple_equal(&ct1->tuplehash[IP_CT_DIR_REPLY].tuple,
>> >> +                              &ct2->tuplehash[IP_CT_DIR_REPLY].tuple) &&
>> >> +            nf_ct_zone_equal(ct1, nf_ct_zone(ct2), IP_CT_DIR_ORIGINAL) &&
>> >> +            nf_ct_zone_equal(ct1, nf_ct_zone(ct2), IP_CT_DIR_REPLY) &&
>> >> +            net_eq(nf_ct_net(ct1), nf_ct_net(ct2));
>> >> +}
>> >> +
>> >>  /* caller must hold rcu readlock and none of the nf_conntrack_locks */
>> >>  static void nf_ct_gc_expired(struct nf_conn *ct)
>> >>  {
>> >> @@ -696,18 +708,21 @@ static int nf_ct_resolve_clash(struct net *net, struct sk_buff *skb,
>> >>       struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
>> >>       const struct nf_conntrack_l4proto *l4proto;
>> >>
>> >> +     enum ip_conntrack_info oldinfo;
>> >> +     struct nf_conn *loser_ct = nf_ct_get(skb, &oldinfo);
>> >> +
>> >>       l4proto = __nf_ct_l4proto_find(nf_ct_l3num(ct), nf_ct_protonum(ct));
>> >>       if (l4proto->allow_clash &&
>> >> -         ((ct->status & IPS_NAT_DONE_MASK) == 0) &&
>> >>           !nf_ct_is_dying(ct) &&
>> >>           atomic_inc_not_zero(&ct->ct_general.use)) {
>> >> -             enum ip_conntrack_info oldinfo;
>> >> -             struct nf_conn *loser_ct = nf_ct_get(skb, &oldinfo);
>> >> -
>> >> -             nf_ct_acct_merge(ct, ctinfo, loser_ct);
>> >> -             nf_conntrack_put(&loser_ct->ct_general);
>> >> -             nf_ct_set(skb, ct, oldinfo);
>> >> -             return NF_ACCEPT;
>> >> +             if (((ct->status & IPS_NAT_DONE_MASK) == 0) ||
>> >> +                 nf_ct_match(ct, loser_ct)) {
>> >> +                     nf_ct_acct_merge(ct, ctinfo, loser_ct);
>> >> +                     nf_conntrack_put(&loser_ct->ct_general);
>> >> +                     nf_ct_set(skb, ct, oldinfo);
>> >> +                     return NF_ACCEPT;
>> >
>> > This will not work because the packet losing race will get mangled by
>> > when NAT is performed.
>>
>> The idea of this patch is to resolve the conflict only among packets
>> with the same mangling (= with matching tuples). The mangling happens
>> before the confirmation. Please correct me if I'm missing something.
>
> Yes, the mangling happens before confirmation, but will mangle the
> source port to make sure two makes don't get the same tuple. Hence,
> the two packets will get different source port numbers, that's why we
> need the code to undo the NAT mangling in
> 368982cd7d1bd41cd39049c794990aca3770db44.

If two packets get different source port numbers (by
"get_unique_tuple"), then their CT tuples won't match (= "nf_ct_match"
will return false) => the conflict won't be resolved.

>
> Have a look at tcpdump to make sure packets are mangled consistently,
> not just going through / not dropped.

Yes, I've checked traces from tcpdump to see whether manglings are OK.
The logs I provided include the traces.

>
>> > Did you test this? Or this is some sort of RFC?
>>
>> I've tested it and put the steps for reproducing the issue and logs
>> for different configurations at https://github.com/brb/conntrack-race.
>
> OK, are you watching for packets drops to make sure your fix work, I
> mean, how are you doing the validation?

I'm validating the fix by logging packets for which the conflict
cannot be resolved and checking with tcpdump whether they have been
dropped.

>
> Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Netfitler Users]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux