On Wed, Nov 30, 2022 at 11:05 PM David Howells <dhowells@xxxxxxxxxx> wrote:
>
> Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> > > Note that this conflicts with my patch:
> > >
> > >         rxrpc: Don't hold a ref for connection workqueue
> > >         https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=rxrpc-next&id=450b00011290660127c2d76f5c5ed264126eb229
> > >
> > > which should render it unnecessary.  It's a little ahead of yours in the
> > > net-next queue, if that means anything.
> >
> > Could you clarify why it is unnecessary?
>
> Rather than tearing down parts of the connection it only logs a trace line,
> frees the memory and decrements the counter on the namespace.  This is used to
> account that all the pieces of memory allocated in that namespace are gone
> before the namespace is removed to check for leaks.  The RCU cleanup used to
> use some other stuff (such as the peer hash) in the rxrpc_net struct but no
> longer will after the patches I submitted.
>
> > After your patch, you are still doing a wake up in your call_rcu() callback:
> >
> > -       ASSERTCMP(refcount_read(&conn->ref), ==, 0);
> > +       if (atomic_dec_and_test(&rxnet->nr_conns))
> > +               wake_up_var(&rxnet->nr_conns);
> > +}
> >
> > Are you saying the code can now tolerate delays? What if the RCU
> > callback is invoked after arbitrarily long delays making the sleeping
> > process to wait?
>
> True.  But that now only holds up the destruction of a net namespace and the
> removal of the rxrpc module.
>
> > If you agree, you can convert the call_rcu() to call_rcu_hurry() in
> > your patch itself. Would you be willing to do that? If not, that's
> > totally OK and I can send a patch later once yours is in (after
> > further testing).
>
> I can add it to part 4 (see my rxrpc-ringless-5 branch) if it is necessary.

Ok sounds good, on module removal the rcu_barrier() will flush out
pending callbacks so that should not be an issue.

Based on your message, I think we can drop this patch then.
Since Paul is already dropping it, no other action is needed.

(I just realized my patch was not fixing a test failure, like the other
net ones did, but rather we found the issue by static analysis -- i.e.
programmatically auditing all callbacks in the kernel doing wake ups).

thanks,

 - Joel
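
For anyone reading the thread later, the concern discussed above (a
wake-up issued from a callback that may be deferred for a long time) can
be illustrated with a toy userspace model. This is only a sketch under
stated assumptions: every model_* name is hypothetical and none of this
is kernel code or a kernel API. Grace periods, per-CPU lists and the
timers that eventually flush lazily queued callbacks are all left out.
The "hurry" flag stands in for call_rcu_hurry(), and model_barrier()
stands in for the flush that rcu_barrier() performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all model_* names are hypothetical, not kernel APIs. */
struct model_cb {
	void (*func)(struct model_cb *cb);
	struct model_cb *next;
};

struct model_queue {
	struct model_cb *head;
	struct model_cb **tail;
	bool has_hurry;		/* set once a non-lazy callback is queued */
};

/* Stand-in for the per-namespace accounting (nr_conns plus a waiter). */
struct model_net {
	int nr_conns;
	bool waiter_woken;
};

struct model_conn {
	struct model_cb cb;	/* must stay first: the callback casts back */
	struct model_net *net;
};

static void model_queue_init(struct model_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
	q->has_hurry = false;
}

/* hurry == false models a lazily queued callback, true models "hurry". */
static void model_enqueue(struct model_queue *q, struct model_cb *cb,
			  void (*func)(struct model_cb *), bool hurry)
{
	cb->func = func;
	cb->next = NULL;
	*q->tail = cb;
	q->tail = &cb->next;
	if (hurry)
		q->has_hurry = true;
}

/* Detach the whole list, then invoke every callback in queue order. */
static void model_run_all(struct model_queue *q)
{
	struct model_cb *cb = q->head;

	model_queue_init(q);
	while (cb) {
		struct model_cb *next = cb->next;

		cb->func(cb);
		cb = next;
	}
}

/* One processing pass: lazy-only queues stay parked in this model. */
static void model_tick(struct model_queue *q)
{
	if (q->has_hurry)
		model_run_all(q);
}

/* Models a barrier-style flush: everything pending runs now. */
static void model_barrier(struct model_queue *q)
{
	model_run_all(q);
}

/* Mirrors the shape of the quoted hunk: drop the count, wake on zero. */
static void model_destroy_conn_cb(struct model_cb *cb)
{
	struct model_conn *conn = (struct model_conn *)cb;

	if (--conn->net->nr_conns == 0)
		conn->net->waiter_woken = true;
}
```

In this model, a callback queued with hurry == false leaves the waiter
asleep across any number of model_tick() calls until model_barrier()
runs, while one queued with hurry == true wakes the waiter on the next
model_tick(). That is the trade-off being weighed above: the lazy path
only delays namespace teardown and module removal, and the barrier on
module removal flushes it anyway.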