On Tue, Feb 04, 2020 at 12:53:51PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2020 at 04:48:43PM +0100, Christian Brauner wrote:
> > This adds support for creating a process in a different cgroup than its
> > parent. Callers can limit and account processes and threads right from
> > the moment they are spawned:
> > - A service manager can directly spawn new services into dedicated
> >   cgroups.
> > - A process can be directly created in a frozen cgroup and will be
> >   frozen as well.
> > - The initial accounting jitter experienced by process supervisors and
> >   daemons is eliminated with this.
> > - Threaded applications or even thread implementations can choose to
> >   create a specific cgroup layout where each thread is spawned
> >   directly into a dedicated cgroup.
> >
> > This feature is limited to the unified hierarchy. Callers need to pass
> > a directory file descriptor for the target cgroup. The caller can
> > choose to pass an O_PATH file descriptor. All usual migration
> > restrictions apply, i.e. there can be no processes in inner nodes. In
> > general, creating a process directly in a target cgroup adheres to all
> > migration restrictions.
>
> AFAICT, the *big* win here is avoiding the write side of the
> cgroup_threadgroup_rwsem. Or am I mis-reading the patch?

No, you're absolutely right. I just didn't bother putting implementation
specifics in the cover letter and I probably should have. So thanks for
pointing that out!

> That global lock is what makes moving tasks/threads around super
> expensive, avoiding that by use of this clone() variant wins the day.

:)
Christian
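
For reference, a rough sketch of what the userspace side might look like.
This is not taken from the patch itself. It assumes the interface as I
understand it was merged in Linux 5.7: a CLONE_INTO_CGROUP flag and a
trailing `cgroup` field in the clone3() argument struct. The struct
`cg_clone_args` is a local mirror of that layout, and the helper names
`cg_clone_args_init()` and `fork_into_cgroup()` are made up for
illustration. The fallback syscall number 435 is likewise an assumption
that holds for x86_64 and the generic syscall table.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: flag value as merged in Linux 5.7. */
#ifndef CLONE_INTO_CGROUP
#define CLONE_INTO_CGROUP 0x200000000ULL
#endif

/* Assumption: clone3 number on x86_64 and the generic syscall table. */
#ifndef __NR_clone3
#define __NR_clone3 435
#endif

/* Local mirror of the kernel's clone3 argument layout (88 bytes). */
struct cg_clone_args {
	uint64_t flags;
	uint64_t pidfd;
	uint64_t child_tid;
	uint64_t parent_tid;
	uint64_t exit_signal;
	uint64_t stack;
	uint64_t stack_size;
	uint64_t tls;
	uint64_t set_tid;
	uint64_t set_tid_size;
	uint64_t cgroup;	/* fd of the target cgroup directory */
};

/* Fill in arguments for a fork()-like clone into the given cgroup. */
struct cg_clone_args cg_clone_args_init(int cgroup_fd)
{
	struct cg_clone_args args;

	assert(cgroup_fd >= 0);
	memset(&args, 0, sizeof(args));
	args.flags = CLONE_INTO_CGROUP;
	args.exit_signal = SIGCHLD;
	args.cgroup = (uint64_t)cgroup_fd;
	return args;
}

/*
 * Returns the child's pid in the parent, 0 in the child, and -1 with
 * errno set on failure. O_PATH could be added to the open flags.
 */
pid_t fork_into_cgroup(const char *cgroup_dir)
{
	struct cg_clone_args args;
	long pid;
	int fd, saved_errno;

	fd = open(cgroup_dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	args = cg_clone_args_init(fd);
	pid = syscall(__NR_clone3, &args, sizeof(args));

	/* Parent and child both get here; neither needs the fd anymore. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return (pid_t)pid;
}
```

A caller would use fork_into_cgroup() the way it would use fork(), passing
the path of a cgroup directory on the unified hierarchy. The usual
migration restrictions quoted above still apply, so the call is expected
to fail for a target that a plain migration would also reject.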