Get rid of preempt_disable() and preempt_enable() when the
preload is done for splitting purpose. The reason is that
calling spin_lock() with disabled preemption is forbidden in
a CONFIG_PREEMPT_RT kernel.

Therefore, with this change we do not guarantee that a CPU is
preloaded; instead we minimize the case when it is not. For
example, I ran a special test case that follows the preload
pattern and path. 20 "unbind" threads ran it and each did
1000000 allocations. On average, a CPU was not preloaded only
3.5 times per 1000000 allocations, so it can happen but the
number is negligible.

V1 -> V2:
  - move the __this_cpu_cmpxchg() check to after spin_lock() is
    taken, as proposed by Andrew Morton
  - add more explanation regarding preloading
  - adjust and move some comments

Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split purpose")
Reviewed-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
---
 mm/vmalloc.c | 50 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 33 insertions(+), 17 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e92ff5f7dd8b..f48cd0711478 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -969,6 +969,19 @@ adjust_va_to_fit_type(struct vmap_area *va,
 	 * There are a few exceptions though, as an example it is
 	 * a first allocation (early boot up) when we have "one"
 	 * big free space that has to be split.
+	 *
+	 * Also we can hit this path in case of regular "vmap"
+	 * allocations, if "this" current CPU was not preloaded.
+	 * See the comment in alloc_vmap_area() why. If so, then
+	 * GFP_NOWAIT is used instead to get an extra object for
+	 * split purpose. That is rare and most time does not
+	 * occur.
+	 *
+	 * What happens if an allocation gets failed. Basically,
+	 * an "overflow" path is triggered to purge lazily freed
+	 * areas to free some memory, then, the "retry" path is
+	 * triggered to repeat one more time. See more details
+	 * in alloc_vmap_area() function.
 	 */
 	lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
 	if (!lva)
@@ -1078,31 +1091,34 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 
 retry:
 	/*
-	 * Preload this CPU with one extra vmap_area object to ensure
-	 * that we have it available when fit type of free area is
-	 * NE_FIT_TYPE.
+	 * Preload this CPU with one extra vmap_area object. It is used
+	 * when fit type of free area is NE_FIT_TYPE. Please note, it
+	 * does not guarantee that an allocation occurs on a CPU that
+	 * is preloaded, instead we minimize the case when it is not.
+	 * It can happen because of migration, because there is a race
+	 * until the below spinlock is taken.
 	 *
 	 * The preload is done in non-atomic context, thus it allows us
 	 * to use more permissive allocation masks to be more stable under
-	 * low memory condition and high memory pressure.
+	 * low memory condition and high memory pressure. In rare case,
+	 * if not preloaded, GFP_NOWAIT is used.
 	 *
-	 * Even if it fails we do not really care about that. Just proceed
-	 * as it is. "overflow" path will refill the cache we allocate from.
+	 * Set "pva" to NULL here, because of "retry" path.
 	 */
-	preempt_disable();
-	if (!__this_cpu_read(ne_fit_preload_node)) {
-		preempt_enable();
-		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
-		preempt_disable();
+	pva = NULL;
 
-		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
-			if (pva)
-				kmem_cache_free(vmap_area_cachep, pva);
-		}
-	}
+	if (!this_cpu_read(ne_fit_preload_node))
+		/*
+		 * Even if it fails we do not really care about that.
+		 * Just proceed as it is. If needed "overflow" path
+		 * will refill the cache we allocate from.
+		 */
+		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
 
 	spin_lock(&vmap_area_lock);
-	preempt_enable();
+
+	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
+		kmem_cache_free(vmap_area_cachep, pva);
 
 	/*
 	 * If an allocation fails, the "vend" address is
-- 
2.20.1
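For reference, the preload sequence in alloc_vmap_area() reads roughly
as below once this patch is applied. This is a condensed sketch
assembled from the hunks above, with the original comments shortened;
it is not a verbatim copy of mm/vmalloc.c:

retry:
	/*
	 * Preloading is best effort: migration can move this task to
	 * another CPU before vmap_area_lock is taken, see the changelog.
	 */
	pva = NULL;

	if (!this_cpu_read(ne_fit_preload_node))
		/*
		 * Non-atomic context, so the permissive GFP_KERNEL mask
		 * can be used. A failure here does not matter, the
		 * "overflow" path refills the cache later if needed.
		 */
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);

	spin_lock(&vmap_area_lock);

	/*
	 * Install the spare object. If this CPU got preloaded in the
	 * meantime, return the extra object back to the cache.
	 */
	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
		kmem_cache_free(vmap_area_cachep, pva);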