On Thu, 28 Sep 2023, Pablo Neira Ayuso wrote: > On Thu, Sep 28, 2023 at 08:57:45PM +0200, Florian Westphal wrote: > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > > On Thu, Sep 28, 2023 at 07:46:30PM +0200, Florian Westphal wrote: > > > > Phil Sutter <phil@xxxxxx> wrote: > > > > > +static int nf_tables_dumpreset_set(struct sk_buff *skb, > > > > > + struct netlink_callback *cb) > > > > > +{ > > > > > + struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk)); > > > > > + struct nft_set_dump_ctx *dump_ctx = cb->data; > > > > > + int ret, skip = cb->args[0]; > > > > > + > > > > > + mutex_lock(&nft_net->commit_mutex); > > > > > + ret = nf_tables_dump_set(skb, cb); > > > > > + mutex_unlock(&nft_net->commit_mutex); > > > > > + > > > > > + if (cb->args[0] > skip) > > > > > + audit_log_nft_set_reset(dump_ctx->ctx.table, cb->seq, > > > > > + cb->args[0] - skip); > > > > > + > > > > > > > > Once commit_mutex is dropped, parallel user can > > > > delete table, and ctx.table references garbage. > > > > > > This path should hold rcu read lock. > > > > Then it would splat on first mutex_lock(). > > > > > I think spinlock would be better, we would just spin for very little > > > time here for another thread to complete the reset, and the race is > > > fixed. > > > > > > The use of commit_mutex here is confusing is really misleading to the > > > reader, this is also not the commit path. > > > > I'd say its semantically the same thing, we alter state. > > > > We even emit audit records to tell userspace that there is state > > change. > > This is a netlink event, how does the mutex help in that regard? > > > Also, are you sure spinlock is appropriate here? > > > > For full-ruleset resets we might be doing quite some > > traverals. > > I said several times, we grab and release this for each > netlink_recvmsg(), commit_mutex helps us in no way to fix the "two > concurrent dump-and-reset problem". > > One concern might be deadlock due to reordering, but I don't see how > that can happen. The same problem exists ipset: when a set is listed/saved (dumped), concurrent destroy/rename/swap for the same set must be excluded. As neither spinlock nor mutex helps, a reference counter is used: the start of the dump increases it and by checking it all concurrent events can safely be rejected by returning EBUSY. Best regards, Jozsef - E-mail : kadlec@xxxxxxxxxxxxxxxxx, kadlecsik.jozsef@xxxxxxxxx PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary