Re: [PATCH v2] netfilter: nf_conntrack: resolve clash for matching conntracks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4 July 2018 at 19:14, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> On Mon, Jul 02, 2018 at 04:52:14PM +0200, Martynas Pumputis wrote:
>> This patch enables the clash resolution for NAT (disabled in
>> "590b52e10d41") if clashing conntracks match (i.e. both tuples are equal)
>> and a protocol allows it.
>>
>> The clash might happen for a connections-less protocol (e.g. UDP) when
>> two threads in parallel writes to the same socket and consequent calls
>> to "get_unique_tuple" return the same tuples (incl. reply tuples).
>>
>> In this case it is safe to perform the resolution, as the losing CT
>> describes the same mangling as the winning CT, so no modifications to
>> the packet are needed, and the result of rules traversal for the loser's
>> packet stays valid.
>>
>> Signed-off-by: Martynas Pumputis <martynas@weave.works>
>> ---
>>  net/netfilter/nf_conntrack_core.c | 31 +++++++++++++++++++++++--------
>>  1 file changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
>> index 3465da2a98bd..cc5ef8c9d0da 100644
>> --- a/net/netfilter/nf_conntrack_core.c
>> +++ b/net/netfilter/nf_conntrack_core.c
>> @@ -502,6 +502,18 @@ nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
>>              net_eq(net, nf_ct_net(ct));
>>  }
>>
>> +static inline bool
>> +nf_ct_match(const struct nf_conn *ct1, const struct nf_conn *ct2)
>> +{
>> +     return nf_ct_tuple_equal(&ct1->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
>> +                              &ct2->tuplehash[IP_CT_DIR_ORIGINAL].tuple) &&
>> +            nf_ct_tuple_equal(&ct1->tuplehash[IP_CT_DIR_REPLY].tuple,
>> +                              &ct2->tuplehash[IP_CT_DIR_REPLY].tuple) &&
>> +            nf_ct_zone_equal(ct1, nf_ct_zone(ct2), IP_CT_DIR_ORIGINAL) &&
>> +            nf_ct_zone_equal(ct1, nf_ct_zone(ct2), IP_CT_DIR_REPLY) &&
>> +            net_eq(nf_ct_net(ct1), nf_ct_net(ct2));
>> +}
>> +
>>  /* caller must hold rcu readlock and none of the nf_conntrack_locks */
>>  static void nf_ct_gc_expired(struct nf_conn *ct)
>>  {
>> @@ -696,18 +708,21 @@ static int nf_ct_resolve_clash(struct net *net, struct sk_buff *skb,
>>       struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
>>       const struct nf_conntrack_l4proto *l4proto;
>>
>> +     enum ip_conntrack_info oldinfo;
>> +     struct nf_conn *loser_ct = nf_ct_get(skb, &oldinfo);
>> +
>>       l4proto = __nf_ct_l4proto_find(nf_ct_l3num(ct), nf_ct_protonum(ct));
>>       if (l4proto->allow_clash &&
>> -         ((ct->status & IPS_NAT_DONE_MASK) == 0) &&
>>           !nf_ct_is_dying(ct) &&
>>           atomic_inc_not_zero(&ct->ct_general.use)) {
>> -             enum ip_conntrack_info oldinfo;
>> -             struct nf_conn *loser_ct = nf_ct_get(skb, &oldinfo);
>> -
>> -             nf_ct_acct_merge(ct, ctinfo, loser_ct);
>> -             nf_conntrack_put(&loser_ct->ct_general);
>> -             nf_ct_set(skb, ct, oldinfo);
>> -             return NF_ACCEPT;
>> +             if (((ct->status & IPS_NAT_DONE_MASK) == 0) ||
>> +                 nf_ct_match(ct, loser_ct)) {
>> +                     nf_ct_acct_merge(ct, ctinfo, loser_ct);
>> +                     nf_conntrack_put(&loser_ct->ct_general);
>> +                     nf_ct_set(skb, ct, oldinfo);
>> +                     return NF_ACCEPT;
>
> This will not work because the packet losing race will get mangled by
> when NAT is performed.

The idea of this patch is to resolve the conflict only among packets
with the same mangling (= with matching tuples). The mangling happens
before the confirmation. Please correct me if I'm missing something.

>
> Did you test this? Or this is some sort of RFC?

I've tested it and put the steps for reproducing the issue and logs
for different configurations at https://github.com/brb/conntrack-race.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Netfitler Users]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux