Hi Phil, On Tue, Sep 26, 2023 at 11:34:43AM +0200, Phil Sutter wrote: > On Mon, Sep 25, 2023 at 09:53:17PM +0200, Florian Westphal wrote: > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > > On Sat, Sep 23, 2023 at 06:18:13PM +0200, Florian Westphal wrote: > > > > callback_that_might_reset() > > > > { > > > > try_module_get ... > > > > rcu_read_unlock() > > > > mutex_lock(net->commit_mutex) > > > > dumper(); > > > > mutex_unlock(net->commit_mutex) > > > > rcu_read_lock(); > > > > module_put() > > > > } > > > > > > > > should do the trick. > > > > > > Idiom above LGTM, *except for net->commit_mutex*. Please do not use > > > ->commit_mutex: This will stall ruleset updates for no reason, netlink > > > dump would grab and release such mutex for each netlink_recvmsg() call > > > and netlink dump side will always retry because of NLM_F_EINTR. > > > > It will stall updates, but for good reason: we are making changes to the > > expressions state. > > This also disqualifies the use of Pablo's suggested reset_lock, right? Quick summary: We are currently discussing if it makes sense to add a new lock or not. The commit_mutex stalls updates, but netlink dumps retrieves listings in chunks, that is, one recvmsg() call from userspace (to retrieve one list chunk) will grab the mutex then release it until the next recvmsg() call is done. Between these two calls an update is still possible. The question is if it is worth to stall an ongoing listing or updates. There is the NLM_F_EINTR mechanism in place that tells that an interference has occured while keeping the listing lockless. Unless I am missing anything, the goal is to fix two different processes that are listing at the same time, that is, two processes running a netlink dump at the same time that are resetting the stateful expressions in the ruleset.