On Fri, Dec 24, 2010 at 02:49:43AM -0800, David Rientjes wrote:
> On Thu, 23 Dec 2010, Ben Blum wrote:
>
> > On Thu, Dec 16, 2010 at 12:26:03AM -0800, Andrew Morton wrote:
> > > Patches have gone a bit stale, sorry.  Refactoring in
> > > kernel/cgroup_freezer.c necessitates a refresh and retest please.
> >
> > commit 53feb29767c29c877f9d47dcfe14211b5b0f7ebd changed a bunch of stuff
> > in kernel/cpuset.c to allocate nodemasks with NODEMASK_ALLOC (which
> > wraps kmalloc) instead of on the stack.
>
> It only wraps kmalloc for CONFIG_NODES_SHIFT > 8.
>
> > 1. All these functions have 'void' return values, indicating that
> > calling them must not fail. Sure, there are bailout cases, but no
> > semblance of cross-function error propagation. Most importantly,
> > cpuset_attach is a subsystem callback, which MUST not fail given the
> > way it's used in cgroups, so relying on kmalloc is not safe.
>
> Yes, that's a valid concern that should be addressed.

Depending on the circumstances that would cause such a failure, and the
consequences of "ignoring" it, would a cop-out with WARN_ON() be
appropriate?

> > 2. I'm working on a patch series which needs to hold tasklist_lock
> > across ss->attach callbacks (in cpuset_attach's "if (threadgroup)"
> > case, this is how safe traversal of tsk->thread_group will be
> > ensured), and kmalloc isn't allowed while holding a spin-lock.
>
> kmalloc() is allowed while holding a spinlock, and NODEMASK_ALLOC() takes
> a gfp_flags argument for that reason.

Ah, it's only disallowed with GFP_KERNEL and friends. So changing the uses
in cpuset_can_attach to GFP_ATOMIC would solve this concern, then? Then
there are no extra issues with my patchset, but note: the use cases for
calling ->attach are going to be hairy enough that it won't be reasonable
to require the caller to back-track if an allocation fails...

> > Why do we need heap-allocation here at all? In each case their scope is
> > exactly the function's scope, and neither the commit nor the surrounding
> > patch series gives any explanation. I'd like to revert the patch if
> > possible.
>
> Because some kernels, such as those with CONFIG_NODES_SHIFT > 8, cause
> stack overflows with the large nodemasks.

Ah. I couldn't find any documentation for what the maximum number of
memory nodes could be. This makes sense. Thanks!

-- Ben
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers
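For readers following the thread, the pattern under discussion can be sketched
in userspace C. This is only a rough sketch of how the kernel's
NODEMASK_ALLOC()/NODEMASK_FREE() macros behave (the real definitions live in
include/linux/nodemask.h): malloc/free stand in for kmalloc/kfree so it
compiles outside the kernel, the NODES_SHIFT value and the nodemask_demo()
helper are made up for illustration, and the gfp flag placeholders exist only
to mirror the kernel signature.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Userspace sketch of the kernel's NODEMASK_ALLOC()/NODEMASK_FREE()
 * pattern.  With CONFIG_NODES_SHIFT > 8 a nodemask_t is considered too
 * big for the kernel stack, so it is heap-allocated; otherwise it lives
 * on the stack and FREE is a no-op.
 */

#define NODES_SHIFT 10                  /* illustrative CONFIG_NODES_SHIFT */
#define MAX_NUMNODES (1 << NODES_SHIFT)

typedef struct {
    unsigned long bits[MAX_NUMNODES / (8 * sizeof(unsigned long))];
} nodemask_t;

#define GFP_KERNEL 0                    /* placeholders mirroring gfp_t */
#define GFP_ATOMIC 1

#if NODES_SHIFT > 8
/* large config: heap allocation, which CAN fail */
#define NODEMASK_ALLOC(type, name, flags) type *name = malloc(sizeof(*name))
#define NODEMASK_FREE(m) free(m)
#else
/* small config: stack variable, FREE is a no-op */
#define NODEMASK_ALLOC(type, name, flags) type _##name, *name = &_##name
#define NODEMASK_FREE(m) do { } while (0)
#endif

/* Exercise the pattern; returns the nodemask size in bytes, 0 on OOM. */
size_t nodemask_demo(void)
{
    size_t sz;
    NODEMASK_ALLOC(nodemask_t, to, GFP_ATOMIC);
    if (!to)                /* the allocation can fail and must be checked */
        return 0;
    memset(to, 0, sizeof(*to));
    sz = sizeof(*to);       /* 1024 possible nodes -> 128 bytes of mask */
    NODEMASK_FREE(to);
    return sz;
}
```

The friction in the thread falls out of this shape: in the large-NODES_SHIFT
configuration the ALLOC expands to an allocation that can return NULL, yet a
void callback like cpuset_attach has no way to propagate that failure, and
once a spinlock such as tasklist_lock is held, only an atomic-style
(GFP_ATOMIC) allocation would be permissible.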