On Fri, Jul 31, 2020 at 09:59:33PM +0100, Matthew Wilcox wrote:
> On Fri, Jul 31, 2020 at 01:48:55PM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 31, 2020 at 01:38:34PM -0700, Andrew Morton wrote:
> > > On Thu, 30 Jul 2020 16:12:05 -0700 "Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
> > >
> > > > So, may we add a GFP_ flag that will cause kmalloc() and friends to return
> > > > NULL when they would otherwise need to acquire their non-raw spinlock?
> > > > This avoids adding any overhead to the slab-allocator fastpaths, but
> > > > allows callback invocation to reduce cache misses without having to
> > > > restructure some existing callers of call_rcu() and potential future
> > > > callers of kfree_rcu().
> > >
> > > We have eight free gfp_t bits so that isn't a problem.
> >
> > Whew!!! ;-)
> >
> > > Adding a test-n-branch to the kmalloc() fastpath may well be a concern.
> > >
> > > Which of mm/sl?b.c are affected?
> >
> > None of them, it turns out. The initial patch will instead directly
> > invoke __get_free_page(). So we could just leave sl?b.c alone.
>
> Isn't that spelled GFP_NOWAIT?

I don't think so in the current kernel, though I might be confused.

The problem we are having isn't waiting, but rather normal spinlock_t
acquisition. This does not count as waiting in !CONFIG_PREEMPT_RT
kernels, and so there are code paths that acquire the non-raw zone_lock
in rmqueue_bulk() even in the GFP_NOWAIT case. Because kfree_rcu()
and call_rcu() and their callers might hold raw spinlocks, acquiring a
non-raw spinlock is forbidden for them and for anything that they call,
directly or indirectly.

The reason for this restriction is that in -rt, the spin_lock(&zone->lock)
in rmqueue_bulk() can sleep. This conversion of non-raw spinlocks to
sleeplocks is part of how -rt reduces scheduling latency. Because
acquiring a raw spinlock disables preemption (even in -rt), acquiring
a non-raw spinlock while holding a raw spinlock gets you "scheduling
while atomic" in -rt.
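
For anyone who wants the nesting rule in concrete form, here is a toy
userspace model of it. This is a sketch only, not kernel code: struct ctx,
model_lock(), model_unlock() and nested_violations() are hypothetical names
made up for illustration, and the preemption counter is a deliberately
simplified stand-in for what the kernel actually tracks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum lock_kind { LOCK_RAW, LOCK_NONRAW };

struct ctx {
	int preempt_count;	/* > 0 means the context is atomic */
	bool rt;		/* stands in for CONFIG_PREEMPT_RT */
	int violations;		/* "scheduling while atomic" events seen */
};

static void model_lock(struct ctx *c, enum lock_kind kind)
{
	if (kind == LOCK_NONRAW && c->rt) {
		/* In -rt a non-raw spinlock is a sleeplock: taking it may schedule. */
		if (c->preempt_count > 0)
			c->violations++;
		return;		/* a sleeplock leaves preemption enabled */
	}
	/* Raw locks always, and non-raw locks in !rt, disable preemption. */
	c->preempt_count++;
}

static void model_unlock(struct ctx *c, enum lock_kind kind)
{
	if (kind == LOCK_NONRAW && c->rt)
		return;
	assert(c->preempt_count > 0);
	c->preempt_count--;
}

/* Take a non-raw lock while holding a raw one; return the violation count. */
static int nested_violations(bool rt)
{
	struct ctx c = { .preempt_count = 0, .rt = rt, .violations = 0 };

	model_lock(&c, LOCK_RAW);
	model_lock(&c, LOCK_NONRAW);
	model_unlock(&c, LOCK_NONRAW);
	model_unlock(&c, LOCK_RAW);
	return c.violations;
}
```

In this model, nested_violations(false) reports no violation because the
inner lock just spins, while nested_violations(true) reports one because
the inner lock may sleep with preemption disabled.
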
And the same nesting will get you lockdep complaints in all kernels,
not just -rt, when CONFIG_PROVE_RAW_LOCK_NESTING is enabled. And my
guess is that CONFIG_PROVE_RAW_LOCK_NESTING=y will become the default
sooner rather than later.

But you are right that yet another approach might be modifying the
GFP_NOWAIT handling so that it avoided acquiring non-raw spinlocks.
However, evaluating that option requires quite a bit more knowledge of
MM than I have! ;-)

							Thanx, Paul