On 12/19/2013 01:12 PM, Michal Hocko wrote:
> On Thu 19-12-13 12:00:58, Glauber Costa wrote:
>> On Thu, Dec 19, 2013 at 11:07 AM, Vladimir Davydov
>> <vdavydov@xxxxxxxxxxxxx> wrote:
>>> On 12/18/2013 09:41 PM, Michal Hocko wrote:
>>>> On Wed 18-12-13 17:16:55, Vladimir Davydov wrote:
>>>>> The memcg_params::memcg_caches array can be updated concurrently from
>>>>> memcg_update_cache_size() and memcg_create_kmem_cache(). Although both
>>>>> of these functions take the slab_mutex during their operation, the
>>>>> latter checks if memcg's cache has already been allocated w/o taking the
>>>>> mutex. This can result in a race as described below.
>>>>>
>>>>> Assume two threads schedule kmem_cache creation works for the same
>>>>> kmem_cache of the same memcg from __memcg_kmem_get_cache(). One of the
>>>>> works successfully creates it. Another work should fail then, but if it
>>>>> interleaves with memcg_update_cache_size() as follows, it does not:
>>>> I am not sure I understand the race. memcg_update_cache_size is called
>>>> when we start accounting a new memcg or a child is created and it
>>>> inherits accounting from the parent. memcg_create_kmem_cache is called
>>>> when a new cache is first allocated from, right?
>>> memcg_update_cache_size() is called when kmem accounting is activated
>>> for a memcg, no matter how.
>>>
>>> memcg_create_kmem_cache() is scheduled from __memcg_kmem_get_cache().
>>> It's OK to have a bunch of such methods trying to create the same memcg
>>> cache concurrently, but only one of them should succeed.
>>>
>>>> Why cannot we simply take slab_mutex inside memcg_create_kmem_cache?
>>>> It is running from the workqueue context so it should not clash with
>>>> other locks.
>>> Hmm, Glauber's code never takes the slab_mutex inside memcontrol.c. I
>>> have always been wondering why, because it could simplify flow paths
>>> significantly (e.g. update_cache_sizes() -> update_all_caches() ->
>>> update_cache_size() - from memcontrol.c to slab_common.c and back again
>>> just to take the mutex).
>>>
>> Because that is a layering violation and exposes implementation
>> details of the slab to the outside world. I agree this would make
>> things a lot simpler, but please check with Christoph if this is
>> acceptable before going forward.
> We do not have to expose the lock directly. We can hide it behind a
> helper function. Relying on the lock silently at many places is worse
> than exposing it IMHO.

BTW, the lock is already exposed by mm/slab.h, which is included in
mm/memcontrol.c :-) So we have immediate access to the lock right now.

Thanks.
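
To make the pattern under discussion concrete, here is a minimal userspace
sketch of doing the "already allocated?" check under the same lock that
protects the array. It is not the kernel code: a pthread mutex stands in
for slab_mutex, and every name ending in _sketch is a hypothetical
stand-in invented for illustration.

```c
#include <assert.h>   /* only needed by callers that assert on the results */
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct cache_sketch { size_t idx; };

static pthread_mutex_t mutex_sketch = PTHREAD_MUTEX_INITIALIZER;
static struct cache_sketch **caches_sketch; /* protected by mutex_sketch */
static size_t nr_slots_sketch;              /* protected by mutex_sketch */
static int nr_created_sketch;               /* counts real creations */

/* Grow the array under the lock, copying the existing entries over. */
int update_cache_size_sketch(size_t new_size)
{
    int ret = 0;

    pthread_mutex_lock(&mutex_sketch);
    if (new_size > nr_slots_sketch) {
        struct cache_sketch **bigger = calloc(new_size, sizeof(*bigger));

        if (!bigger) {
            ret = -1;
        } else {
            for (size_t i = 0; i < nr_slots_sketch; i++)
                bigger[i] = caches_sketch[i];
            free(caches_sketch);
            caches_sketch = bigger;
            nr_slots_sketch = new_size;
        }
    }
    pthread_mutex_unlock(&mutex_sketch);
    return ret;
}

/*
 * Create the cache for slot idx at most once. The existence check and the
 * store happen under the same lock, so they cannot interleave with a
 * concurrent resize or with a second creator.
 */
struct cache_sketch *create_cache_sketch(size_t idx)
{
    struct cache_sketch *c = NULL;

    pthread_mutex_lock(&mutex_sketch);
    if (idx < nr_slots_sketch) {
        c = caches_sketch[idx];
        if (!c) {
            c = malloc(sizeof(*c));
            if (c) {
                c->idx = idx;
                caches_sketch[idx] = c;
                nr_created_sketch++;
            }
        }
    }
    pthread_mutex_unlock(&mutex_sketch);
    return c;
}
```

In this sketch a second caller for the same slot simply gets the existing
object back and nr_created_sketch stays at 1, even if the array was resized
in between. Whether the losing work item should fail or reuse the existing
cache is a separate choice that the sketch does not try to settle.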