2018-05-01 6:22 GMT+09:00 Florian Westphal <fw@xxxxxxxxx>: > Florian Westphal <fw@xxxxxxxxx> wrote: >> Taehee Yoo <ap420073@xxxxxxxxx> wrote: >> > In the second patch, you said that you can't reproduce this problem. >> > If the nft_counter is unloaded, you can reproduce this problem. >> > Could you please test this? >> >> Ineed, that reproduces this. >> I think what nft_compat.c is doing in select_ops() is illegal, >> select_ops should not have any side effects. >> >> I'll see if we can reorganize this and defer module_get until we >> call ops->init(). > > I came up with this and it seems to work. Do you see > any issues with this? > I have been testing this patch, As result, that works very well! My test cases and results are below 1: $rmmod nft_counter $iptables-compat -I OUTPUT -m cpu --cpu 0 $iptables-compat -F $lsmod | grep xt_cpu Module Size Used by xt_cpu 16384 0 2: $rmmod nft_counter $iptables-compat -I OUTPUT -m cpu --cpu 0 -m cpu --cpu 1 ... -m cpu --cpu 200 $iptables-compat -F $lsmod | grep xt_cpu Module Size Used by xt_cpu 16384 1 $rmmod nft_compat $lsmod | grep xt_cpu Module Size Used by xt_cpu 16384 0 Other test results are also good! (for example rule replace, ops->validate callback error scenario) And there is no warning(after-use-free-access and others). > Subject: netfilter: nf_tables: nft_compat: fix refcount leak on xt module > > Taehee Yoo reported following bug: > iptables-compat -I OUTPUT -m cpu --cpu 0 > iptables-compat -F > lsmod |grep xt_cpu > xt_cpu 16384 1 > > Quote: > "When above command is given, a netlink message has two expressions that > are the cpu compat and the nft_counter. > The nft_expr_type_get() in the nf_tables_expr_parse() successes > first expression then, calls select_ops callback. > (allocates memory and holds module) > But, second nft_expr_type_get() in the nf_tables_expr_parse() > returns -EAGAIN because of request_module(). > In that point, by the 'goto err1', > the 'module_put(info[i].ops->type->owner)' is called. > There is no release routine." > > The problem is that unlike all other expressions, > nft_compat select_ops has side effects. > > 1. it allocates dynamic memory which holds an nft ops struct. > In all other expressions, ops has static storage duration. > 2. It grabs references to the xt module that it is supposed to > invoke. > > Depending on where things go wrong, error unwinding doesn't > always do the right thing. > > In the above scenario, a new nft_compat_expr is created and > xt_cpu module gets loaded with a refcount of 1. > > Due to to -EAGAIN, the netlink messages get re-parsed. > When that happens, nft_compat finds that xt_cpu is already present > and increments module refcount again. > > This fixes the problem by making select_ops to have no visible > side effects and removes all extra module_get/put. > > When select_ops creates a new nft_compat expression, the new > expression has a refcount of 0, and the xt module gets its refcount > incremented. > > When error happens, the next call finds existing entry, but will no > longer increase the reference count -- the presence of existing > nft_xt means we already hold a module reference. > > Because nft_xt_put is only called from nft_compat destroy hook, > it will never see the initial zero reference count. > ->destroy can only be called after ->init(), and that will increase the > refcount. > > Lastly, we now free nft_xt struct with kfree_rcu. > Else, we get use-after free in nf_tables_rule_destroy: > > while (expr != nft_expr_last(rule) && expr->ops) { > nf_tables_expr_destroy(ctx, expr); > expr = nft_expr_next(expr); // here > > nft_expr_next() dereferences expr->ops. This is safe > for all users, as ops have static storage duration. > In nft_compat case however, its ->destroy callback can > free the memory that hold the ops structure. > > Reported-by: Taehee Yoo <ap420073@xxxxxxxxx> > Signed-off-by: Florian Westphal <fw@xxxxxxxxx> > --- > net/netfilter/nft_compat.c | 94 +++++++++++++++++++++++++++++----------------- > 1 file changed, 60 insertions(+), 34 deletions(-) > > diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c > index 8e23726b9081..6c96b32b4187 100644 > --- a/net/netfilter/nft_compat.c > +++ b/net/netfilter/nft_compat.c > @@ -27,14 +27,26 @@ struct nft_xt { > struct list_head head; > struct nft_expr_ops ops; > unsigned int refcnt; > + > + /* Unlike other expressions, ops doesn't have static storage duration > + * here, but nft core assumes they do, in particular it assumes ops remain > + * valid after calling ->destroy(), but in nft_compat case this could free > + * ops as well. > + * > + * Its enough to delay free operation with kfree_rcu. > + */ > + struct rcu_head rcu_head; > }; > > -static void nft_xt_put(struct nft_xt *xt) > +static bool nft_xt_put(struct nft_xt *xt) > { > if (--xt->refcnt == 0) { > list_del(&xt->head); > - kfree(xt); > + kfree_rcu(xt, rcu_head); > + return true; > } > + > + return false; > } > > static int nft_compat_chain_validate_dependency(const char *tablename, > @@ -226,6 +238,7 @@ nft_target_init(const struct nft_ctx *ctx, const struct nft_expr *expr, > struct xt_target *target = expr->ops->data; > struct xt_tgchk_param par; > size_t size = XT_ALIGN(nla_len(tb[NFTA_TARGET_INFO])); > + struct nft_xt *nft_xt; > u16 proto = 0; > bool inv = false; > union nft_entry e = {}; > @@ -236,25 +249,22 @@ nft_target_init(const struct nft_ctx *ctx, const struct nft_expr *expr, > if (ctx->nla[NFTA_RULE_COMPAT]) { > ret = nft_parse_compat(ctx->nla[NFTA_RULE_COMPAT], &proto, &inv); > if (ret < 0) > - goto err; > + return ret; > } > > nft_target_set_tgchk_param(&par, ctx, target, info, &e, proto, inv); > > ret = xt_check_target(&par, size, proto, inv); > if (ret < 0) > - goto err; > + return ret; > > /* The standard target cannot be used */ > - if (target->target == NULL) { > - ret = -EINVAL; > - goto err; > - } > + if (target->target == NULL) > + return -EINVAL; > > + nft_xt = container_of(expr->ops, struct nft_xt, ops); > + nft_xt->refcnt++; > return 0; > -err: > - module_put(target->me); > - return ret; > } > > static void > @@ -271,8 +281,8 @@ nft_target_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr) > if (par.target->destroy != NULL) > par.target->destroy(&par); > > - nft_xt_put(container_of(expr->ops, struct nft_xt, ops)); > - module_put(target->me); > + if (nft_xt_put(container_of(expr->ops, struct nft_xt, ops))) > + module_put(target->me); > } > > static int nft_target_dump(struct sk_buff *skb, const struct nft_expr *expr) > @@ -411,6 +421,7 @@ nft_match_init(const struct nft_ctx *ctx, const struct nft_expr *expr, > struct xt_match *match = expr->ops->data; > struct xt_mtchk_param par; > size_t size = XT_ALIGN(nla_len(tb[NFTA_MATCH_INFO])); > + struct nft_xt *nft_xt; > u16 proto = 0; > bool inv = false; > union nft_entry e = {}; > @@ -421,19 +432,18 @@ nft_match_init(const struct nft_ctx *ctx, const struct nft_expr *expr, > if (ctx->nla[NFTA_RULE_COMPAT]) { > ret = nft_parse_compat(ctx->nla[NFTA_RULE_COMPAT], &proto, &inv); > if (ret < 0) > - goto err; > + return ret; > } > > nft_match_set_mtchk_param(&par, ctx, match, info, &e, proto, inv); > > ret = xt_check_match(&par, size, proto, inv); > if (ret < 0) > - goto err; > + return ret; > > + nft_xt = container_of(expr->ops, struct nft_xt, ops); > + nft_xt->refcnt++; > return 0; > -err: > - module_put(match->me); > - return ret; > } > > static void > @@ -450,8 +460,8 @@ nft_match_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr) > if (par.match->destroy != NULL) > par.match->destroy(&par); > > - nft_xt_put(container_of(expr->ops, struct nft_xt, ops)); > - module_put(match->me); > + if (nft_xt_put(container_of(expr->ops, struct nft_xt, ops))) > + module_put(match->me); > } > > static int nft_match_dump(struct sk_buff *skb, const struct nft_expr *expr) > @@ -654,13 +664,8 @@ nft_match_select_ops(const struct nft_ctx *ctx, > list_for_each_entry(nft_match, &nft_match_list, head) { > struct xt_match *match = nft_match->ops.data; > > - if (nft_match_cmp(match, mt_name, rev, family)) { > - if (!try_module_get(match->me)) > - return ERR_PTR(-ENOENT); > - > - nft_match->refcnt++; > + if (nft_match_cmp(match, mt_name, rev, family)) > return &nft_match->ops; > - } > } > > match = xt_request_find_match(family, mt_name, rev); > @@ -679,7 +684,7 @@ nft_match_select_ops(const struct nft_ctx *ctx, > goto err; > } > > - nft_match->refcnt = 1; > + nft_match->refcnt = 0; > nft_match->ops.type = &nft_match_type; > nft_match->ops.size = NFT_EXPR_SIZE(XT_ALIGN(match->matchsize)); > nft_match->ops.eval = nft_match_eval; > @@ -739,13 +744,8 @@ nft_target_select_ops(const struct nft_ctx *ctx, > list_for_each_entry(nft_target, &nft_target_list, head) { > struct xt_target *target = nft_target->ops.data; > > - if (nft_target_cmp(target, tg_name, rev, family)) { > - if (!try_module_get(target->me)) > - return ERR_PTR(-ENOENT); > - > - nft_target->refcnt++; > + if (nft_target_cmp(target, tg_name, rev, family)) > return &nft_target->ops; > - } > } > > target = xt_request_find_target(family, tg_name, rev); > @@ -764,7 +764,7 @@ nft_target_select_ops(const struct nft_ctx *ctx, > goto err; > } > > - nft_target->refcnt = 1; > + nft_target->refcnt = 0; > nft_target->ops.type = &nft_target_type; > nft_target->ops.size = NFT_EXPR_SIZE(XT_ALIGN(target->targetsize)); > nft_target->ops.init = nft_target_init; > @@ -823,6 +823,32 @@ static int __init nft_compat_module_init(void) > > static void __exit nft_compat_module_exit(void) > { > + struct nft_xt *xt, *next; > + > + /* list should be empty here, it can be non-empty only in case there > + * was an error that caused nft_xt expr to not be initialized fully > + * and noone else requested the same expression later. > + * > + * In this case, the lists contain 0-refcount entries that still > + * hold module reference. > + */ > + list_for_each_entry_safe(xt, next, &nft_target_list, head) { > + struct xt_target *target = xt->ops.data; > + > + if (WARN_ON_ONCE(xt->refcnt)) > + continue; > + module_put(target->me); > + kfree(xt); > + } > + > + list_for_each_entry_safe(xt, next, &nft_match_list, head) { > + struct xt_match *match = xt->ops.data; > + > + if (WARN_ON_ONCE(xt->refcnt)) > + continue; > + module_put(match->me); > + kfree(xt); > + } > nfnetlink_subsys_unregister(&nfnl_compat_subsys); > nft_unregister_expr(&nft_target_type); > nft_unregister_expr(&nft_match_type); > -- > 2.16.1 > -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html