On Sun, Jun 26, 2016 at 2:55 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote: > We currently keep every task's stack around until the task_struct > itself is freed. This means that we keep the stack allocation alive > for longer than necessary and that, under load, we free stacks in > big batches whenever RCU drops the last task reference. Neither of > these is good for reuse of cache-hot memory, and freeing in batches > prevents us from usefully caching small numbers of vmalloced stacks. > > On architectures that have thread_info on the stack, we can't easily > change this, but on architectures that set THREAD_INFO_IN_TASK, we > can free it as soon as the task is dead. This is broken: > -void free_task(struct task_struct *tsk) > +void release_task_stack(struct task_struct *tsk) > { > account_kernel_stack(tsk, -1); > arch_release_thread_stack(tsk->stack); > free_thread_stack(tsk); > + tsk->stack = NULL; > +#ifdef CONFIG_VMAP_STACK > + tsk->stack_vm_area = NULL; > +#endif > +} > + > +void free_task(struct task_struct *tsk) > +{ > +#ifndef CONFIG_THREAD_INFO_IN_TASK > + /* > + * The task is finally done with both the stack and thread_info, > + * so free both. > + */ > + release_task_stack(tsk); > +#else > + /* > + * If the task had a separate stack allocation, it should be gone > + * by now. > + */ > + WARN_ON_ONCE(tsk->stack); > +#endif We can get to free_task without first going through TASK_DEAD if we fail to clone(). I'm inclined to make release_task_stack be safe to call more than once and to call it unconditionally in free_task, since doing it without branches (calling release_task_stack in the copy_process failure path) will require more ifdeffery and sounds like more trouble than it's worth. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html