On Fri, Aug 14, 2020 at 07:14:25AM -0700, Paul E. McKenney wrote: > Doing this to kfree_rcu() is the first step. We will also be doing this > to call_rcu(), which has some long-standing invocations from various > raw contexts, including hardirq handler. Most hardirq handler are not raw on RT due to threaded interrupts. Lockdep knows about this. > > > 4) As kfree_rcu() can be used from any context except NMI and RT > > > relies on it we ran into a circular dependency problem. > > > > Well, which actual usage sites are under a raw spinlock? Most of the > > ones I could find are not. > > There are some on their way in, but this same optimization will be applied > to call_rcu(), and there are no shortage of such call sites in that case. > And these call sites have been around for a very long time. Luckily there is lockdep to help you find the ones that need converting to raw_call_rcu() :-) > > > Clearly the simplest solution but not Pauls favourite and > > > unfortunately he has a good reason. > > > > Which isn't actually stated anywhere I suppose ? > > Several times, but why not one more? ;-) > > In CONFIG_PREEMPT_NONE=y kernels, which are heavily used, and for which > the proposed kfree_rcu() and later call_rcu() optimizations are quite > important, there is no way to tell at runtime whether or you are in > atomic raw context. CONFIG_PREEMPT_NONE has nothing what so ever to do with any of this. > > > > 2. Adding a GFP_ flag. > > > > > > Michal does not like the restriction for !RT kernels and tries to > > > avoid the introduction of a new allocation mode. > > > > Like above, I tend to be with Michal on this, just wrap the actual > > allocation in CONFIG_PREEMPT_RT, the code needs to handle a NULL pointer > > there anyway. > > That disables the optimization in the CONFIG_PREEMPT_NONE=y case, > where it really is needed. No, it disables it for CONFIG_PREEMPT_RT. > I would be OK with either. In CONFIG_PREEMPT_NONE=n kernels, the > kfree_rcu() code could use preemptible() to determine whether it was safe > to invoke the allocator. The code in kfree_rcu() might look like this: > > mem = NULL; > if (IS_ENABLED(CONFIG_PREEMPT_NONE) || preemptible()) > mem = __get_free_page(...); > > Is your point is that the usual mistakes would then be caught by the > usual testing on CONFIG_PREEMPT_NONE=n kernels? mem = NULL; #if !defined(CONFIG_PREEMPT_RT) && !defined(CONFIG_PROVE_LOCKING) mem = __get_free_page(...) #endif if (!mem) But I _really_ would much rather have raw_kfree_rcu() and raw_call_rcu() variants for the few places that actually need it. > > > These are not seperate of each other as #3 requires #4. The most > > > horrible solution IMO from a technical POV as it proliferates > > > inconsistency for no good reaosn. > > > > > > Aside of that it'd be solving a problem which does not exist simply > > > because kfree_rcu() does not depend on it and we really don't want to > > > set precedence and encourage the (ab)use of this in any way. > > > > My preferred solution is 1, if you want to use an allocator, you simply > > don't get to play under raw_spinlock_t. And from a quick grep, most > > kfree_rcu() users are not under raw_spinlock_t context. > > There is at least one on its way in, but more to the point, we will > need to apply this same optimization to call_rcu(), which is used in There is no need, call_rcu() works perfectly fine today, thank you. You might want to, but that's an entirely different thing. > raw atomic context, including from hardirq handlers. Threaded IRQs. There really is very little code that is 'raw' on RT.