On Thu 19-12-13 12:00:58, Glauber Costa wrote:
> On Thu, Dec 19, 2013 at 11:07 AM, Vladimir Davydov
> <vdavydov@xxxxxxxxxxxxx> wrote:
> > On 12/18/2013 09:41 PM, Michal Hocko wrote:
> >> On Wed 18-12-13 17:16:55, Vladimir Davydov wrote:
> >>> The memcg_params::memcg_caches array can be updated concurrently from
> >>> memcg_update_cache_size() and memcg_create_kmem_cache(). Although both
> >>> of these functions take the slab_mutex during their operation, the
> >>> latter checks if memcg's cache has already been allocated w/o taking the
> >>> mutex. This can result in a race as described below.
> >>>
> >>> Assume two threads schedule kmem_cache creation works for the same
> >>> kmem_cache of the same memcg from __memcg_kmem_get_cache(). One of the
> >>> works successfully creates it. Another work should fail then, but if it
> >>> interleaves with memcg_update_cache_size() as follows, it does not:
> >> I am not sure I understand the race. memcg_update_cache_size is called
> >> when we start accounting a new memcg or a child is created and it
> >> inherits accounting from the parent. memcg_create_kmem_cache is called
> >> when a new cache is first allocated from, right?
> >
> > memcg_update_cache_size() is called when kmem accounting is activated
> > for a memcg, no matter how.
> >
> > memcg_create_kmem_cache() is scheduled from __memcg_kmem_get_cache().
> > It's OK to have a bunch of such methods trying to create the same memcg
> > cache concurrently, but only one of them should succeed.
> >
> >> Why cannot we simply take slab_mutex inside memcg_create_kmem_cache?
> >> it is running from the workqueue context so it should clash with other
> >> locks.
> >
> > Hmm, Glauber's code never takes the slab_mutex inside memcontrol.c. I
> > have always been wondering why, because it could simplify flow paths
> > significantly (e.g. update_cache_sizes() -> update_all_caches() ->
> > update_cache_size() - from memcontrol.c to slab_common.c and back again
> > just to take the mutex).
> >
>
> Because that is a layering violation and exposes implementation
> details of the slab to the outside world. I agree this would make
> things a lot simpler, but please check with Christoph if this is
> acceptable before going forward.

We do not have to expose the lock directly. We can hide it behind a
helper function. Relying on the lock silently in many places is worse
than exposing it IMHO.
-- 
Michal Hocko
SUSE Labs
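
For illustration only, here is a minimal userspace sketch of the "hide the
lock behind a helper" idea. It is not kernel code and not a proposed patch:
a pthread mutex stands in for slab_mutex, a fixed array stands in for
memcg_params::memcg_caches, and every name in it (cache_get_or_create,
caches_created, slab_lock, slots, NR_SLOTS) is hypothetical. The point it
tries to show is that the "already allocated?" check and the store happen
under the same lock, so concurrent creators cannot both succeed, while
callers never touch the lock itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_SLOTS 4                      /* hypothetical fixed array size */

struct cache {                          /* stand-in for struct kmem_cache */
	int id;
};

/* Private to this file; stand-in for slab_mutex. Callers never see it. */
static pthread_mutex_t slab_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the memcg_params::memcg_caches array. */
static struct cache *slots[NR_SLOTS];
static int nr_created;

/*
 * Hypothetical helper: return the cache in slot idx, creating it if it
 * does not exist yet. The existence check is done with the lock held,
 * so two concurrent callers cannot both create the same cache.
 * Returns NULL only if the allocation fails.
 */
struct cache *cache_get_or_create(size_t idx)
{
	struct cache *c;

	assert(idx < NR_SLOTS);

	pthread_mutex_lock(&slab_lock);
	c = slots[idx];
	if (!c) {
		c = malloc(sizeof(*c));
		if (c) {
			c->id = (int)idx;
			slots[idx] = c;
			nr_created++;
		}
	}
	pthread_mutex_unlock(&slab_lock);

	return c;
}

/* Number of caches actually created so far, read under the lock. */
int caches_created(void)
{
	int n;

	pthread_mutex_lock(&slab_lock);
	n = nr_created;
	pthread_mutex_unlock(&slab_lock);

	return n;
}
```

Calling cache_get_or_create() twice for the same index returns the same
pointer and bumps the creation count only once. Whether a helper like this
is acceptable on the slab side is exactly the layering question raised
above; the sketch does not settle it.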