On Tue, May 29, 2018 at 05:12:04PM -0700, Shakeel Butt wrote:
> The memcg kmem cache creation and deactivation (SLUB only) is
> asynchronous. If a root kmem cache is destroyed whose memcg cache is in
> the process of creation or deactivation, the kernel may crash.
>
> Example of one such crash:
>     general protection fault: 0000 [#1] SMP PTI
>     CPU: 1 PID: 1721 Comm: kworker/14:1 Not tainted 4.17.0-smp
>     ...
>     Workqueue: memcg_kmem_cache kmemcg_deactivate_workfn
>     RIP: 0010:has_cpu_slab
>     ...
>     Call Trace:
>     ? on_each_cpu_cond
>     __kmem_cache_shrink
>     kmemcg_cache_deact_after_rcu
>     kmemcg_deactivate_workfn
>     process_one_work
>     worker_thread
>     kthread
>     ret_from_fork+0x35/0x40
>
> To fix this race, on root kmem cache destruction, mark the cache as
> dying and flush the workqueue used for memcg kmem cache creation and
> deactivation.

> @@ -845,6 +862,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
>      if (unlikely(!s))
>          return;
>
> +    flush_memcg_workqueue(s);
> +

This should definitely help against async memcg_kmem_cache_create(),
but I'm afraid it doesn't eliminate the race with async destruction,
unfortunately, because the latter uses call_rcu_sched():

  memcg_deactivate_kmem_caches
   __kmem_cache_deactivate
    slab_deactivate_memcg_cache_rcu_sched
     call_rcu_sched
                                            kmem_cache_destroy
                                             shutdown_memcg_caches
                                              shutdown_cache
      memcg_deactivate_rcufn
       <dereference destroyed cache>

Can we somehow flush those pending rcu requests?
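
To make the ordering problem above concrete, here is a minimal userspace
model of it. This is only a sketch, not kernel code: every name in it
(model_cache, model_call_deferred, model_barrier, model_deactivate_cb,
model_destroy) is hypothetical and exists only for illustration.
model_call_deferred() stands in for call_rcu_sched(), model_deactivate_cb()
for memcg_deactivate_rcufn(), and model_barrier() for whatever primitive
would drain the pending callbacks. In the kernel, rcu_barrier_sched() waits
for outstanding call_rcu_sched() callbacks; whether calling it from the
destroy path is sufficient here is an assumption the model does not settle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_cache {
	bool alive;		/* false once the cache has been "destroyed" */
	int deactivated;	/* times the callback ran on a live cache */
};

typedef void (*deferred_fn)(struct model_cache *c);

#define MAX_PENDING 8

struct pending {
	deferred_fn fn;
	struct model_cache *cache;
};

static struct pending pending_cbs[MAX_PENDING];
static size_t nr_pending;
static int use_after_destroy;	/* callbacks that found a dead cache */

/* Stand-in for call_rcu_sched(): queue the callback to run later. */
static bool model_call_deferred(deferred_fn fn, struct model_cache *c)
{
	if (nr_pending == MAX_PENDING)
		return false;
	pending_cbs[nr_pending].fn = fn;
	pending_cbs[nr_pending].cache = c;
	nr_pending++;
	return true;
}

/* Stand-in for a barrier: run every callback queued so far. */
static void model_barrier(void)
{
	for (size_t i = 0; i < nr_pending; i++)
		pending_cbs[i].fn(pending_cbs[i].cache);
	nr_pending = 0;
}

/* Stand-in for memcg_deactivate_rcufn(): it touches the cache. */
static void model_deactivate_cb(struct model_cache *c)
{
	if (!c->alive) {
		/* The kernel would dereference a destroyed cache here. */
		use_after_destroy++;
		return;
	}
	c->deactivated++;
}

/* Destroy the cache, optionally draining pending callbacks first. */
static void model_destroy(struct model_cache *c, bool flush_first)
{
	if (flush_first)
		model_barrier();
	c->alive = false;
}
```

In this model, queueing model_deactivate_cb() and then calling
model_destroy(c, false) lets the callback run after the cache is gone, which
use_after_destroy records. Calling model_destroy(c, true) instead drains the
callback while the cache is still alive. The model is single-threaded, so it
shows only the ordering, not the concurrency.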