Re: [PATCH] cgroup: Don't drop the cgroup_mutex in cgroup_rmdir

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 19, 2012 at 07:39:32PM +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
> 
> We dropped cgroup mutex, because of a deadlock between memcg and cpuset.
> cpuset took hotplug lock followed by cgroup_mutex, where as memcg pre_destroy
> did lru_add_drain_all() which took hotplug lock while already holding
> cgroup_mutex. The deadlock is explained in 3fa59dfbc3b223f02c26593be69ce6fc9a940405
> But dropping cgroup_mutex in cgroup_rmdir also means tasks could get
> added to cgroup while we are in pre_destroy. This makes error handling in
> pre_destroy complex. So move the unlock/lock to memcg pre_destroy callback.
> Core cgroup will now call pre_destroy with cgroup_mutex held.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>

I generally think cgroup_mutex shouldn't be dropped across any cgroup
hierarchy changing operation and thus agree with the cgroup core
change.

>  static int mem_cgroup_pre_destroy(struct cgroup *cont)
>  {
> +	int ret;
>  	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
>  
> -	return mem_cgroup_force_empty(memcg, false);
> +	cgroup_unlock();
> +	/*
> +	 * we call lru_add_drain_all, which end up taking
> +	 * mutex_lock(&cpu_hotplug.lock), But cpuset have
> +	 * the reverse order. So drop the cgroup lock
> +	 */
> +	ret = mem_cgroup_force_empty(memcg, false);
> +	cgroup_lock();
> +	return ret;
>  }

This reverse dependency from cpuset is the same problem Glauber
reported a while ago.  I don't know why / how cgroup_mutex got
exported to outside world but this is asking for trouble.  cgroup
mutex protects cgroup hierarchies.  There are many core subsystems
which implement cgroup controllers.  Controller callbacks for
hierarchy changing oeprations need to synchronize with the rest of the
core subsystems.  So, by design, in locking hierarchy, cgroup_mutex
has to be one of the outermost locks.  If somebody tries to grab it
from inside other core subsystem locks, there of course will be
circular locking dependencies.

So, Peter, why does cpuset mangle with cgroup_mutex?  What guarantees
does it need?  Why can't it work on "changed" notification while
caching the current css like blkcg does?

Let's please unexport cgroup_mutex.  It's moronic to expose that
outside cgroup.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]