Re: [PATCH 1/3 nf-next] netfilter: nf_tables: add release callback in nft_expr_type

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



2018-05-01 6:22 GMT+09:00 Florian Westphal <fw@xxxxxxxxx>:
> Florian Westphal <fw@xxxxxxxxx> wrote:
>> Taehee Yoo <ap420073@xxxxxxxxx> wrote:
>> > In the second patch, you said that you can't reproduce this problem.
>> > If the nft_counter is unloaded, you can reproduce this problem.
>> > Could you please test this?
>>
>> Ineed, that reproduces this.
>> I think what nft_compat.c is doing in select_ops() is illegal,
>> select_ops should not have any side effects.
>>
>> I'll see if we can reorganize this and defer module_get until we
>> call ops->init().
>
> I came up with this and it seems to work.  Do you see
> any issues with this?
>

I have been testing this patch,
As result, that works very well!

My test cases and results are below

1:
   $rmmod nft_counter
   $iptables-compat -I OUTPUT -m cpu --cpu 0
   $iptables-compat -F
   $lsmod | grep xt_cpu
Module                  Size  Used by
xt_cpu                 16384  0

2:
   $rmmod nft_counter
   $iptables-compat -I OUTPUT -m cpu --cpu 0 -m cpu --cpu 1 ... -m cpu --cpu 200
   $iptables-compat -F
   $lsmod | grep xt_cpu
Module                  Size  Used by
xt_cpu                 16384  1
   $rmmod nft_compat
   $lsmod | grep xt_cpu
Module                  Size  Used by
xt_cpu                 16384  0

Other test results are also good!
(for example rule replace, ops->validate callback error scenario)
And there is no warning(after-use-free-access and others).


> Subject: netfilter: nf_tables: nft_compat: fix refcount leak on xt module
>
> Taehee Yoo reported following bug:
>   iptables-compat -I OUTPUT -m cpu --cpu 0
>   iptables-compat -F
>   lsmod |grep xt_cpu
>   xt_cpu                 16384  1
>
> Quote:
> "When above command is given, a netlink message has two expressions that
> are the cpu compat and the nft_counter.
> The nft_expr_type_get() in the nf_tables_expr_parse() successes
> first expression then, calls select_ops callback.
> (allocates memory and holds module)
> But, second nft_expr_type_get() in the nf_tables_expr_parse()
> returns -EAGAIN because of request_module().
> In that point, by the 'goto err1',
> the 'module_put(info[i].ops->type->owner)' is called.
> There is no release routine."
>
> The problem is that unlike all other expressions,
> nft_compat select_ops has side effects.
>
> 1. it allocates dynamic memory which holds an nft ops struct.
>    In all other expressions, ops has static storage duration.
> 2. It grabs references to the xt module that it is supposed to
>    invoke.
>
> Depending on where things go wrong, error unwinding doesn't
> always do the right thing.
>
> In the above scenario, a new nft_compat_expr is created and
> xt_cpu module gets loaded with a refcount of 1.
>
> Due to to -EAGAIN, the netlink messages get re-parsed.
> When that happens, nft_compat finds that xt_cpu is already present
> and increments module refcount again.
>
> This fixes the problem by making select_ops to have no visible
> side effects and removes all extra module_get/put.
>
> When select_ops creates a new nft_compat expression, the new
> expression has a refcount of 0, and the xt module gets its refcount
> incremented.
>
> When error happens, the next call finds existing entry, but will no
> longer increase the reference count -- the presence of existing
> nft_xt means we already hold a module reference.
>
> Because nft_xt_put is only called from nft_compat destroy hook,
> it will never see the initial zero reference count.
> ->destroy can only be called after ->init(), and that will increase the
> refcount.
>
> Lastly, we now free nft_xt struct with kfree_rcu.
> Else, we get use-after free in nf_tables_rule_destroy:
>
>   while (expr != nft_expr_last(rule) && expr->ops) {
>     nf_tables_expr_destroy(ctx, expr);
>     expr = nft_expr_next(expr); // here
>
> nft_expr_next() dereferences expr->ops. This is safe
> for all users, as ops have static storage duration.
> In nft_compat case however, its ->destroy callback can
> free the memory that hold the ops structure.
>
> Reported-by: Taehee Yoo <ap420073@xxxxxxxxx>
> Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> ---
>  net/netfilter/nft_compat.c | 94 +++++++++++++++++++++++++++++-----------------
>  1 file changed, 60 insertions(+), 34 deletions(-)
>
> diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
> index 8e23726b9081..6c96b32b4187 100644
> --- a/net/netfilter/nft_compat.c
> +++ b/net/netfilter/nft_compat.c
> @@ -27,14 +27,26 @@ struct nft_xt {
>         struct list_head        head;
>         struct nft_expr_ops     ops;
>         unsigned int            refcnt;
> +
> +       /* Unlike other expressions, ops doesn't have static storage duration
> +        * here, but nft core assumes they do, in particular it assumes ops remain
> +        * valid after calling ->destroy(), but in nft_compat case this could free
> +        * ops as well.
> +        *
> +        * Its enough to delay free operation with kfree_rcu.
> +        */
> +       struct rcu_head         rcu_head;
>  };
>
> -static void nft_xt_put(struct nft_xt *xt)
> +static bool nft_xt_put(struct nft_xt *xt)
>  {
>         if (--xt->refcnt == 0) {
>                 list_del(&xt->head);
> -               kfree(xt);
> +               kfree_rcu(xt, rcu_head);
> +               return true;
>         }
> +
> +       return false;
>  }
>
>  static int nft_compat_chain_validate_dependency(const char *tablename,
> @@ -226,6 +238,7 @@ nft_target_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
>         struct xt_target *target = expr->ops->data;
>         struct xt_tgchk_param par;
>         size_t size = XT_ALIGN(nla_len(tb[NFTA_TARGET_INFO]));
> +       struct nft_xt *nft_xt;
>         u16 proto = 0;
>         bool inv = false;
>         union nft_entry e = {};
> @@ -236,25 +249,22 @@ nft_target_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
>         if (ctx->nla[NFTA_RULE_COMPAT]) {
>                 ret = nft_parse_compat(ctx->nla[NFTA_RULE_COMPAT], &proto, &inv);
>                 if (ret < 0)
> -                       goto err;
> +                       return ret;
>         }
>
>         nft_target_set_tgchk_param(&par, ctx, target, info, &e, proto, inv);
>
>         ret = xt_check_target(&par, size, proto, inv);
>         if (ret < 0)
> -               goto err;
> +               return ret;
>
>         /* The standard target cannot be used */
> -       if (target->target == NULL) {
> -               ret = -EINVAL;
> -               goto err;
> -       }
> +       if (target->target == NULL)
> +               return -EINVAL;
>
> +       nft_xt = container_of(expr->ops, struct nft_xt, ops);
> +       nft_xt->refcnt++;
>         return 0;
> -err:
> -       module_put(target->me);
> -       return ret;
>  }
>
>  static void
> @@ -271,8 +281,8 @@ nft_target_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr)
>         if (par.target->destroy != NULL)
>                 par.target->destroy(&par);
>
> -       nft_xt_put(container_of(expr->ops, struct nft_xt, ops));
> -       module_put(target->me);
> +       if (nft_xt_put(container_of(expr->ops, struct nft_xt, ops)))
> +               module_put(target->me);
>  }
>
>  static int nft_target_dump(struct sk_buff *skb, const struct nft_expr *expr)
> @@ -411,6 +421,7 @@ nft_match_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
>         struct xt_match *match = expr->ops->data;
>         struct xt_mtchk_param par;
>         size_t size = XT_ALIGN(nla_len(tb[NFTA_MATCH_INFO]));
> +       struct nft_xt *nft_xt;
>         u16 proto = 0;
>         bool inv = false;
>         union nft_entry e = {};
> @@ -421,19 +432,18 @@ nft_match_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
>         if (ctx->nla[NFTA_RULE_COMPAT]) {
>                 ret = nft_parse_compat(ctx->nla[NFTA_RULE_COMPAT], &proto, &inv);
>                 if (ret < 0)
> -                       goto err;
> +                       return ret;
>         }
>
>         nft_match_set_mtchk_param(&par, ctx, match, info, &e, proto, inv);
>
>         ret = xt_check_match(&par, size, proto, inv);
>         if (ret < 0)
> -               goto err;
> +               return ret;
>
> +       nft_xt = container_of(expr->ops, struct nft_xt, ops);
> +       nft_xt->refcnt++;
>         return 0;
> -err:
> -       module_put(match->me);
> -       return ret;
>  }
>
>  static void
> @@ -450,8 +460,8 @@ nft_match_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr)
>         if (par.match->destroy != NULL)
>                 par.match->destroy(&par);
>
> -       nft_xt_put(container_of(expr->ops, struct nft_xt, ops));
> -       module_put(match->me);
> +       if (nft_xt_put(container_of(expr->ops, struct nft_xt, ops)))
> +               module_put(match->me);
>  }
>
>  static int nft_match_dump(struct sk_buff *skb, const struct nft_expr *expr)
> @@ -654,13 +664,8 @@ nft_match_select_ops(const struct nft_ctx *ctx,
>         list_for_each_entry(nft_match, &nft_match_list, head) {
>                 struct xt_match *match = nft_match->ops.data;
>
> -               if (nft_match_cmp(match, mt_name, rev, family)) {
> -                       if (!try_module_get(match->me))
> -                               return ERR_PTR(-ENOENT);
> -
> -                       nft_match->refcnt++;
> +               if (nft_match_cmp(match, mt_name, rev, family))
>                         return &nft_match->ops;
> -               }
>         }
>
>         match = xt_request_find_match(family, mt_name, rev);
> @@ -679,7 +684,7 @@ nft_match_select_ops(const struct nft_ctx *ctx,
>                 goto err;
>         }
>
> -       nft_match->refcnt = 1;
> +       nft_match->refcnt = 0;
>         nft_match->ops.type = &nft_match_type;
>         nft_match->ops.size = NFT_EXPR_SIZE(XT_ALIGN(match->matchsize));
>         nft_match->ops.eval = nft_match_eval;
> @@ -739,13 +744,8 @@ nft_target_select_ops(const struct nft_ctx *ctx,
>         list_for_each_entry(nft_target, &nft_target_list, head) {
>                 struct xt_target *target = nft_target->ops.data;
>
> -               if (nft_target_cmp(target, tg_name, rev, family)) {
> -                       if (!try_module_get(target->me))
> -                               return ERR_PTR(-ENOENT);
> -
> -                       nft_target->refcnt++;
> +               if (nft_target_cmp(target, tg_name, rev, family))
>                         return &nft_target->ops;
> -               }
>         }
>
>         target = xt_request_find_target(family, tg_name, rev);
> @@ -764,7 +764,7 @@ nft_target_select_ops(const struct nft_ctx *ctx,
>                 goto err;
>         }
>
> -       nft_target->refcnt = 1;
> +       nft_target->refcnt = 0;
>         nft_target->ops.type = &nft_target_type;
>         nft_target->ops.size = NFT_EXPR_SIZE(XT_ALIGN(target->targetsize));
>         nft_target->ops.init = nft_target_init;
> @@ -823,6 +823,32 @@ static int __init nft_compat_module_init(void)
>
>  static void __exit nft_compat_module_exit(void)
>  {
> +       struct nft_xt *xt, *next;
> +
> +       /* list should be empty here, it can be non-empty only in case there
> +        * was an error that caused nft_xt expr to not be initialized fully
> +        * and noone else requested the same expression later.
> +        *
> +        * In this case, the lists contain 0-refcount entries that still
> +        * hold module reference.
> +        */
> +       list_for_each_entry_safe(xt, next, &nft_target_list, head) {
> +               struct xt_target *target = xt->ops.data;
> +
> +               if (WARN_ON_ONCE(xt->refcnt))
> +                       continue;
> +               module_put(target->me);
> +               kfree(xt);
> +       }
> +
> +       list_for_each_entry_safe(xt, next, &nft_match_list, head) {
> +               struct xt_match *match = xt->ops.data;
> +
> +               if (WARN_ON_ONCE(xt->refcnt))
> +                       continue;
> +               module_put(match->me);
> +               kfree(xt);
> +       }
>         nfnetlink_subsys_unregister(&nfnl_compat_subsys);
>         nft_unregister_expr(&nft_target_type);
>         nft_unregister_expr(&nft_match_type);
> --
> 2.16.1
>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Netfitler Users]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux