On 2023/10/18 14:34, Hyeonggon Yoo wrote:
> On Wed, Oct 18, 2023 at 12:45 AM <chengming.zhou@xxxxxxxxx> wrote:
>> 4. Testing
>> ==========
>> For now we have only done some simple testing on a server with 128 CPUs
>> (2 nodes) to compare performance.
>>
>> - perf bench sched messaging -g 5 -t -l 100000
>>        baseline        RFC
>>        7.042s          6.966s
>>        7.022s          7.045s
>>        7.054s          6.985s
>>
>> - stress-ng --rawpkt 128 --rawpkt-ops 100000000
>>        baseline        RFC
>>        2.42s           2.15s
>>        2.45s           2.16s
>>        2.44s           2.17s
>>
>> The results above show about a 10% improvement in the stress-ng rawpkt
>> testcase, although not much improvement in the perf sched bench testcase.
>>
>> Thanks for any comment and code review!
>
> Hi Chengming, this is the kerneltesting.org test report for your patch series.
>
> I applied this series on my slab-experimental tree [1] for testing,
> and I observed several kernel panics [2] [3] [4] on kernels without
> CONFIG_SLUB_CPU_PARTIAL.
>
> To verify that this series caused the kernel panics, I tested before and
> after applying it on Vlastimil's slab/for-next, and yeah, this series was
> the cause.
>
> The system deadlocks on memory, and the OOM killer reports a huge amount
> of slab memory. So maybe there is a memory leak, or the series makes slab
> memory grow unboundedly?

Thanks for the testing! I can reproduce the OOM locally without
CONFIG_SLUB_CPU_PARTIAL. A quick fix is below (a better fix will follow).

The root cause is in patch 4, which wrongly puts some partial slabs onto
the CPU partial list even when CONFIG_SLUB_CPU_PARTIAL is disabled, so
those partial slabs are leaked.

diff --git a/mm/slub.c b/mm/slub.c
index d58eaf8447fd..b7ba6c008122 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2339,12 +2339,12 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
 			}
 		}
 
+#ifdef CONFIG_SLUB_CPU_PARTIAL
 		remove_partial(n, slab);
 		put_cpu_partial(s, slab, 0);
 		stat(s, CPU_PARTIAL_NODE);
 		partial_slabs++;
 
-#ifdef CONFIG_SLUB_CPU_PARTIAL
 		if (!kmem_cache_has_cpu_partial(s)
 			|| partial_slabs > s->cpu_partial_slabs / 2)
 			break;

>
> [1] https://git.kerneltesting.org/slab-experimental/
> [2] https://lava.kerneltesting.org/scheduler/job/127#bottom
> [3] https://lava.kerneltesting.org/scheduler/job/131#bottom
> [4] https://lava.kerneltesting.org/scheduler/job/134#bottom
>
>>
>> Chengming Zhou (5):
>>   slub: Introduce on_partial()
>>   slub: Don't manipulate slab list when used by cpu
>>   slub: Optimize deactivate_slab()
>>   slub: Don't freeze slabs for cpu partial
>>   slub: Introduce get_cpu_partial()
>>
>>  mm/slab.h |   2 +-
>>  mm/slub.c | 257 +++++++++++++++++++++++++++++++-----------------------
>>  2 files changed, 150 insertions(+), 109 deletions(-)
>>
>> --
>> 2.40.1
>>
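
A short note on why those slabs end up leaked, for anyone reading along:
when CONFIG_SLUB_CPU_PARTIAL is not set, put_cpu_partial() exists only as
an empty stub (the sketch below follows mainline mm/slub.c; the patched
tree may differ slightly), so the unguarded code removed a slab from the
node partial list and then did nothing further with it, leaving the slab
unreachable:

/*
 * Sketch of the !CONFIG_SLUB_CPU_PARTIAL case, modeled on mainline
 * mm/slub.c.  remove_partial(n, slab) takes the slab off the node
 * partial list; this no-op stub then never queues it anywhere, so the
 * slab is lost and the accumulated leak eventually triggers the OOM
 * killer seen in the test reports.
 */
#ifndef CONFIG_SLUB_CPU_PARTIAL
static inline void put_cpu_partial(struct kmem_cache *s, struct slab *slab,
				   int drain) { }
#endif

Moving the #ifdef above remove_partial(), as in the quick fix, keeps those
slabs on the node partial list in that configuration.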