When this_cpu changes in the free path, node needs to change too.
Otherwise the slab can end up in the wrong node's list, which eventually
leads to WARN_ONs and, of course, worse NUMA performance.

This patch is likely not complete (the NUMA slab code is *very* hairy),
but with it the make -j128 test survives for at least two hours. It fixes
one case that regularly triggered during testing, resulting in slabs in
the wrong node lists and triggering WARN_ONs in slab_put/get_obj.

I tried a complete audit of keeping this_cpu/node/slabp in sync where
needed, but it is very hairy code and I likely missed some cases. So far
this fixes only the simple free path, but that seems to be good enough
that the problem no longer triggers easily on a NUMA system under memory
pressure.

Longer term the only good fix is probably to migrate to slub, or to
disable NUMA slab for PREEMPT_RT (its value has been disputed in some
benchmarks anyway).

Signed-off-by: Andi Kleen <ak@xxxxxxx>

Index: linux-2.6.23-rt1/mm/slab.c
===================================================================
--- linux-2.6.23-rt1.orig/mm/slab.c
+++ linux-2.6.23-rt1/mm/slab.c
@@ -1193,7 +1193,7 @@ cache_free_alien(struct kmem_cache *cach
 	struct array_cache *alien = NULL;
 	int node;
 
-	node = numa_node_id();
+	node = cpu_to_node(*this_cpu);
 
 	/*
 	 * Make sure we are not freeing a object from another node to the array
@@ -4194,6 +4194,8 @@ static void cache_reap(struct work_struc
 		work_done += reap_alien(searchp, l3, &this_cpu);
 
+		node = cpu_to_node(this_cpu);
+
 		work_done += drain_array(searchp, l3,
 			cpu_cache_get(searchp, this_cpu), 0, node);
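
For illustration only, here is a small userspace model of the stale-node
problem described above. It is not kernel code: the topology table and
every model_* helper are hypothetical stand-ins, and it assumes only what
the patch itself implies, namely that a call taking &this_cpu may leave
this_cpu pointing at a different CPU, so a node computed earlier no
longer matches it.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical topology: CPUs 0-1 sit on node 0, CPUs 2-3 on node 1. */
static const int cpu_node_map[NR_CPUS] = { 0, 0, 1, 1 };

/* Userspace stand-in for the kernel's cpu_to_node(). */
static int model_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_node_map[cpu];
}

/*
 * Stand-in for a call that takes &this_cpu and may return with
 * *this_cpu changed (assumption: this is what reap_alien() can do
 * in the -rt tree, since the patch recomputes node right after it).
 */
static void model_call_that_may_switch_cpu(int *this_cpu, int new_cpu)
{
	*this_cpu = new_cpu;
}

/* Unpatched pattern: node is computed once and goes stale. */
static int stale_node_after_migration(int start_cpu, int new_cpu)
{
	int this_cpu = start_cpu;
	int node = model_cpu_to_node(this_cpu);

	model_call_that_may_switch_cpu(&this_cpu, new_cpu);
	return node;	/* still describes start_cpu, not this_cpu */
}

/* Patched pattern: node is re-derived from this_cpu after the call. */
static int fresh_node_after_migration(int start_cpu, int new_cpu)
{
	int this_cpu = start_cpu;
	int node;

	model_call_that_may_switch_cpu(&this_cpu, new_cpu);
	node = model_cpu_to_node(this_cpu);
	return node;
}
```

With this model, a switch from CPU 0 to CPU 2 leaves the stale variant
on node 0 while the re-derived variant reports node 1, which is the
mismatch that would put a slab on the wrong node's list.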