On Thu, Oct 26, 2023 at 10:26:35AM +0200, Pablo Neira Ayuso wrote: > On Thu, Oct 26, 2023 at 10:15:33AM +0200, Pablo Neira Ayuso wrote: > > Cc'ing Florian. > > > > On Wed, Oct 25, 2023 at 11:00:14PM +0200, Pablo Neira Ayuso wrote: > > > On Wed, Oct 25, 2023 at 10:08:28PM +0200, Phil Sutter wrote: > > > > Objects' dump callbacks are not concurrency-safe per-se with reset bit > > > > set. If two CPUs perform a reset at the same time, at least counter and > > > > quota objects suffer from value underrun. > > > > > > > > Prevent this by introducing dedicated locking callbacks for nfnetlink > > > > and the asynchronous dump handling to serialize access. > > > > > > > > Signed-off-by: Phil Sutter <phil@xxxxxx> > > > > --- > > > > net/netfilter/nf_tables_api.c | 72 ++++++++++++++++++++++++++++------- > > > > 1 file changed, 59 insertions(+), 13 deletions(-) > > > > > > > > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c > > > > index 5f84bdd40c3f..245a2c5be082 100644 > > > > --- a/net/netfilter/nf_tables_api.c > > > > +++ b/net/netfilter/nf_tables_api.c > > > [...] > > > > @@ -7832,16 +7876,18 @@ static int nf_tables_getobj(struct sk_buff *skb, const struct nfnl_info *info, > > > > return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c); > > > > } > > > > > > > > - if (NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_GETOBJ_RESET) > > > > - reset = true; > > > > + if (!try_module_get(THIS_MODULE)) > > > > + return -EINVAL; > > > > > > For netlink dump path, __netlink_dump_start() already grabs a > > > reference module this via c->module. > > > > > > Why is this module reference needed for getting one object? This does > > > not follow netlink dump path, it creates the skb and it returns > > > inmediately. > > > > nfnetlink callbacks use nfnetlink_get_subsys() which use > > rcu_dereference() to fetch the nfnetlink_subsystem callbacks. In > > nfnetlink_rcv_batch() the ss pointer is fetched at the beginning of > > the batch processing. > > Correction: This is nfnetlink_rcv_msg() path, not nfnetlink_rcv_batch() > path because this is a _GET command which should not ever follow > nfnetlink_rcv_batch() path. > > But still the reason below is possible, considering a skb that > contains two _GET requests (which is possible because netlink supports > for non-atomic batches, ie. stacking several netlink messages in one > sendmsg() call). Scratch this. nfnetlink_rcv_msg() is called for each netlink message, then the nfnetlink_subsystem pointer are re-fetch. In summary: the try_module_get() before rcu_read_unlock() from netlink get/dump is safe. Sorry for the noise.