On Thu, Nov 02, 2023 at 10:35:32PM +0100, Pablo Neira Ayuso wrote: > On Thu, Nov 02, 2023 at 03:59:53PM +0100, Phil Sutter wrote: > > Set expressions' dump callbacks are not concurrency-safe per-se with > > reset bit set. If two CPUs reset the same element at the same time, > > values may underrun at least with element-attached counters and quotas. > > > > Prevent this by introducing dedicated callbacks for nfnetlink and the > > asynchronous dump handling to serialize access. > > > > Signed-off-by: Phil Sutter <phil@xxxxxx> > > --- > > Changes since v2: > > - Move the audit_log_nft_set_reset() call into the critical section to > > protect the table pointer dereference. > > - Drop unused nelems variable from (non-reset) nf_tables_getsetelem(). > > --- > > net/netfilter/nf_tables_api.c | 109 +++++++++++++++++++++++++++++----- > > 1 file changed, 94 insertions(+), 15 deletions(-) > > Adds quite a bit of code: I guess because of the copy and paste to add > nf_tables_getsetelem_reset(). Having to sort the locking issue mentioned below allowed for a bit of code deduplication, I hope v4 looks better in this regard. [...] > > @@ -6093,17 +6105,84 @@ static int nf_tables_getsetelem(struct sk_buff *skb, > > return -EINVAL; > > > > nla_for_each_nested(attr, nla[NFTA_SET_ELEM_LIST_ELEMENTS], rem) { > > - err = nft_get_set_elem(&ctx, set, attr, reset); > > + err = nft_get_set_elem(&ctx, set, attr, false); > > + if (err < 0) { > > + NL_SET_BAD_ATTR(extack, attr); > > + break; > > + } > > + } > > + > > + return err; > > +} > > + > > +static int nf_tables_getsetelem_reset(struct sk_buff *skb, > > + const struct nfnl_info *info, > > + const struct nlattr * const nla[]) > > +{ > > + struct nftables_pernet *nft_net = nft_pernet(info->net); > > + struct netlink_ext_ack *extack = info->extack; > > + u8 genmask = nft_genmask_cur(info->net); > > + u8 family = info->nfmsg->nfgen_family; > > + int rem, err = 0, nelems = 0; > > + struct net *net = info->net; > > + struct nft_table *table; > > + struct nft_set *set; > > + struct nlattr *attr; > > + struct nft_ctx ctx; > > + > > + table = nft_table_lookup(net, nla[NFTA_SET_ELEM_LIST_TABLE], family, > > + genmask, 0); > > + if (IS_ERR(table)) { > > + NL_SET_BAD_ATTR(extack, nla[NFTA_SET_ELEM_LIST_TABLE]); > > + return PTR_ERR(table); > > + } > > + > > + set = nft_set_lookup(table, nla[NFTA_SET_ELEM_LIST_SET], genmask); > > + if (IS_ERR(set)) > > + return PTR_ERR(set); > > + > > + nft_ctx_init(&ctx, net, skb, info->nlh, family, table, NULL, nla); > > + > > + if (info->nlh->nlmsg_flags & NLM_F_DUMP) { > > + struct netlink_dump_control c = { > > + .start = nf_tables_dump_set_start, > > + .dump = nf_tables_dumpreset_set, > > + .done = nf_tables_dump_set_done, > > + .module = THIS_MODULE, > > + }; > > + struct nft_set_dump_ctx dump_ctx = { > > + .set = set, > > + .ctx = ctx, > > + .reset = true, > > + }; > > + > > + c.data = &dump_ctx; > > + return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c); > > + } > > + > > + if (!nla[NFTA_SET_ELEM_LIST_ELEMENTS]) > > + return -EINVAL; > > + > > + if (!try_module_get(THIS_MODULE)) > > + return -EINVAL; > > + rcu_read_unlock(); > > Existing table and set pointer get invalid from here on, after leaving > rcu read side lock. You're right. I just submitted v4 which should fix that by performing the needed lookups inside the critical section. > > + mutex_lock(&nft_net->commit_mutex); > > + rcu_read_lock(); > > grab mutex and rcu at the same time, back again? Yes, this is required AFAICT because the lookup callback of nft_set_rhash_type wants to be called with RCU read lock held. Cheers, Phil