Re: [PATCH v5 3/3] cgroups: make procs file writable

David Rientjes <rientjes@xxxxxxxxxx> · Fri, 24 Dec 2010 02:49:43 -0800 (PST)

On Thu, 23 Dec 2010, Ben Blum wrote:

> On Thu, Dec 16, 2010 at 12:26:03AM -0800, Andrew Morton wrote:
> > Patches have gone a bit stale, sorry.  Refactoring in
> > kernel/cgroup_freezer.c necessitates a refresh and retest please.
> 
> commit 53feb29767c29c877f9d47dcfe14211b5b0f7ebd changed a bunch of stuff
> in kernel/cpuset.c to allocate nodemasks with NODEMASK_ALLOC (which
> wraps kmalloc) instead of on the stack.
> 

It only wraps kmalloc for CONFIG_NODES_SHIFT > 8.

> 1. All these functions have 'void' return values, indicating that
>    calling them them must not fail. Sure there are bailout cases, but no
>    semblance of cross-function error propagation. Most importantly,
>    cpuset_attach is a subsystem callback, which MUST not fail given the
>    way it's used in cgroups, so relying on kmalloc is not safe.
> 

Yes, that's a valid concern that should be addressed.

> 2. I'm working on a patch series which needs to hold tasklist_lock
>    across ss->attach callbacks (in cpuset_attach's "if (threadgroup)"
>    case, this is how safe traversal of tsk->thread_group will be
>    ensured), and kmalloc isn't allowed while holding a spin-lock. 
> 

kmalloc() is allowed while holding a spinlock and NODEMASK_ALLOC() takes a 
gfp_flags argument for that reason.

> Why do we need heap-allocation here at all? In each case their scope is
> exactly the function's scope, and neither the commit nor the surrounding
> patch series give any explanation. I'd like to revert the patch if
> possible.
> 

Because some kernels, such as those with CONFIG_NODES_SHIFT > 8, cause 
stack overflows with the large nodemasks.
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers