On 4/11/19 5:26 AM, Qian Cai wrote:
> "cat /proc/slab_allocators" could hang forever on SMP machines with
> kmemleak or object debugging enabled due to other CPUs running do_drain()
> will keep making kmemleak_object or debug_objects_cache dirty and unable
> to escape the first loop in leaks_show(),

So what if we don't remove SLAB (yet?) but start removing the debugging
functionality that has been broken for years and nobody noticed. I think
Linus already mentioned that we remove at least the
/proc/slab_allocators file...

> do {
> 	set_store_user_clean(cachep);
> 	drain_cpu_caches(cachep);
> 	...
>
> } while (!is_store_user_clean(cachep));
>
> For example,
>
> do_drain
>   slabs_destroy
>     slab_destroy
>       kmem_cache_free
>         __cache_free
>           ___cache_free
>             kmemleak_free_recursive
>               delete_object_full
>                 __delete_object
>                   put_object
>                     free_object_rcu
>                       kmem_cache_free
>                         cache_free_debugcheck --> dirty kmemleak_object
>
> One approach is to check cachep->name and skip both kmemleak_object and
> debug_objects_cache in leaks_show(). The other is to set
> store_user_clean after drain_cpu_caches() which leaves a small window
> between drain_cpu_caches() and set_store_user_clean() where per-CPU
> caches could be dirty again lead to slightly wrong information has been
> stored but could also speed up things significantly which sounds like a
> good compromise. For example,
>
>  # cat /proc/slab_allocators
>  0m42.778s # 1st approach
>  0m0.737s  # 2nd approach
>
> Fixes: d31676dfde25 ("mm/slab: alternative implementation for DEBUG_SLAB_LEAK")
> Signed-off-by: Qian Cai <cai@xxxxxx>
> ---
>  mm/slab.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 9142ee992493..3e1b7ff0360c 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -4328,8 +4328,12 @@ static int leaks_show(struct seq_file *m, void *p)
>  	 * whole processing.
>  	 */
>  	do {
> -		set_store_user_clean(cachep);
>  		drain_cpu_caches(cachep);
> +		/*
> +		 * drain_cpu_caches() could always make kmemleak_object and
> +		 * debug_objects_cache dirty, so reset afterwards.
> +		 */
> +		set_store_user_clean(cachep);
>
>  		x[1] = 0;
>
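
For illustration only, here is a minimal userspace sketch of why the ordering inside the loop matters. It is not the kernel code: struct toy_cache, toy_drain() and both scan_*() helpers are hypothetical names, and it assumes the worst case described in the quoted changelog, namely that every drain of this particular cache dirties it again (as with kmemleak_object). The iteration cap exists only so the sketch terminates; leaks_show() has no such cap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kmem_cache: only the flag the loop tests. */
struct toy_cache {
	bool store_user_clean;
};

/*
 * Assumption: models drain_cpu_caches() on a cache whose objects are
 * freed back into it while draining, so the drain itself dirties it.
 */
static void toy_drain(struct toy_cache *c)
{
	c->store_user_clean = false;
}

/*
 * Original ordering: mark clean, then drain.
 * Returns the number of iterations used, or -1 if the cap was hit.
 */
static int scan_clean_then_drain(struct toy_cache *c, int max_iters)
{
	assert(max_iters > 0);
	for (int i = 1; i <= max_iters; i++) {
		c->store_user_clean = true;
		toy_drain(c);		/* dirties the cache again */
		if (c->store_user_clean)
			return i;
	}
	return -1;			/* never observed a clean pass */
}

/*
 * Patched ordering: drain, then mark clean.
 * Returns the number of iterations used, or -1 if the cap was hit.
 */
static int scan_drain_then_clean(struct toy_cache *c, int max_iters)
{
	assert(max_iters > 0);
	for (int i = 1; i <= max_iters; i++) {
		toy_drain(c);
		c->store_user_clean = true;	/* reset after the drain */
		if (c->store_user_clean)
			return i;
	}
	return -1;
}
```

Under that assumption the first helper never sees a clean pass and runs into the cap, while the second exits on its first pass. The sketch does not model the small window between the drain and the reset that the changelog mentions.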