On Tue, Jan 17, 2017 at 03:54:04PM -0800, Tejun Heo wrote:
> With kmem cgroup support enabled, kmem_caches can be created and
> destroyed frequently and a great number of near-empty kmem_caches can
> accumulate if there are a lot of transient cgroups and the system is
> not under memory pressure.  When memory reclaim starts under such
> conditions, it can lead to consecutive deactivation and destruction of
> many kmem_caches, easily hundreds of thousands on moderately large
> systems, exposing scalability issues in the current slab management
> code.  This is one of the patches to address the issue.
>
> SLAB_DESTROY_BY_RCU caches need to flush all RCU operations before
> destruction because slab pages are freed through RCU and they need to
> be able to dereference the associated kmem_cache.  Currently, it's
> done synchronously with rcu_barrier().  As rcu_barrier() is expensive
> time-wise, slab implements a batching mechanism so that rcu_barrier()
> can be done for multiple caches at the same time.
>
> Unfortunately, the rcu_barrier() is in the synchronous path which is
> called while holding cgroup_mutex and the batching is too limited to
> be actually helpful.
>
> This patch updates the cache release path so that the batching is
> asynchronous and global.  All SLAB_DESTROY_BY_RCU caches are queued
> globally and a work item consumes the list.  The work item calls
> rcu_barrier() only once for all caches that are currently queued.
>
> * release_caches() is removed and shutdown_cache() now either directly
>   releases the cache or schedules an RCU callback to do that.  This
>   makes the cache inaccessible once shutdown_cache() is called and
>   makes it impossible for shutdown_memcg_caches() to do memcg-specific
>   cleanups afterwards.  Move memcg-specific part into a helper,
>   unlink_memcg_cache(), and make shutdown_cache() call it directly.
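
For anyone following along, the flow described above can be modeled in
userspace roughly as follows.  This is only a sketch under stated
assumptions, not the code from the patch: struct cache, new_cache(),
release_cache(), fake_rcu_barrier(), deferred_release_work() and the two
counters are made-up names for illustration.  Only shutdown_cache(),
unlink_memcg_cache() and rcu_barrier() are names taken from the
description.  Locking and the actual scheduling of the work item are
left out, so the caller runs the "work item" by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kmem_cache; not the kernel's layout. */
struct cache {
	bool destroy_by_rcu;	/* models the SLAB_DESTROY_BY_RCU flag */
	bool memcg_linked;	/* models membership in a memcg's cache list */
	struct cache *next;	/* link in the global deferred-release queue */
};

struct cache *deferred_head;	/* global queue of caches awaiting release */
int barrier_calls;		/* number of simulated rcu_barrier() calls */
int released;			/* number of caches actually freed */

/* Stub: the real rcu_barrier() waits for all pending RCU callbacks. */
void fake_rcu_barrier(void)
{
	barrier_calls++;
}

void release_cache(struct cache *c)
{
	free(c);
	released++;
}

/* Models unlink_memcg_cache(): memcg cleanup must precede the release. */
void unlink_memcg_cache(struct cache *c)
{
	c->memcg_linked = false;
}

/* Models the work item: one barrier covers everything queued so far. */
void deferred_release_work(void)
{
	/* Take the whole queue at once (the kernel would do this under a lock). */
	struct cache *batch = deferred_head;

	deferred_head = NULL;
	if (!batch)
		return;

	fake_rcu_barrier();
	while (batch) {
		struct cache *next = batch->next;

		release_cache(batch);
		batch = next;
	}
}

/* Models shutdown_cache(): release right away, or queue for the work item. */
void shutdown_cache(struct cache *c)
{
	unlink_memcg_cache(c);

	if (!c->destroy_by_rcu) {
		release_cache(c);
		return;
	}

	c->next = deferred_head;
	deferred_head = c;
	/* The kernel would schedule the work item here; the sketch does not. */
}

/* Hypothetical helper to allocate a cache for the sketch. */
struct cache *new_cache(bool destroy_by_rcu)
{
	struct cache *c = calloc(1, sizeof(*c));

	assert(c != NULL);
	c->destroy_by_rcu = destroy_by_rcu;
	c->memcg_linked = true;
	return c;
}
```

Shutting down two RCU caches and one plain cache and then calling
deferred_release_work() once frees all three with a single simulated
barrier, which is the point of making the batching global.  The memcg
unlink happens inside shutdown_cache() because the cache may already be
gone by the time the caller gets control back.
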
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Reported-by: Jay Vana <jsvana@xxxxxx>
> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Pekka Enberg <penberg@xxxxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

Acked-by: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>