(2012/04/12 3:57), Frederic Weisbecker wrote:
> Hi,
>
> While talking with Tejun about targeting the cgroup task counter subsystem
> for the next merge window, he suggested checking whether this could be merged
> into the memcg subsystem rather than creating a new cgroup subsystem just
> for the purpose of limiting the task count.
>
> So I'm pinging you guys to seek your insight.
>
> I assume not everybody in the Cc list knows what the task counter subsystem
> is all about. So here is a summary: this is a cgroup subsystem (latest version
> in https://lwn.net/Articles/478631/) that keeps track of the number of tasks
> present in a cgroup. Hooks are set in task fork/exit and cgroup migration to
> keep this accounting visible through a special tasks.usage file. The user can
> set a limit on the number of tasks by writing to the tasks.limit file.
> Further forks or cgroup migrations are then rejected if the limit is exceeded.
>
> This feature is especially useful to protect against forkbombs in containers.
> Or more generally to limit the number of tasks in a cgroup as a resource,
> since each task involves some kernel memory allocation.
>
> Now the dilemma is how to implement it:
>
> 1) As a standalone subsystem, as it stands currently (https://lwn.net/Articles/478631/)
>
> 2) As a feature in memcg, part of the memory.kmem.* files. This makes sense
> because this is about limiting kernel memory allocation. We could have a
> memory.kmem.tasks.count
>
> My personal opinion is that the task counter brings some overhead: a charge
> across the whole hierarchy at every fork, and the mirrored uncharge on task exit.
> And this overhead happens even in the off-case (when the task counter subsystem
> is mounted but the limit is the default: ULLONG_MAX).
>
> So if we choose the second solution, this overhead will be added unconditionally
> to memcg.
> But I don't expect every user of memcg to need the task counter. So perhaps
> the overhead should be kept in its own separate subsystem.
>
> OTOH the memory.kmem.* interface would be a good fit.
>
> What do you think?

Sounds interesting to me.

Hm, is the overhead of task accounting large enough to be visible to
users? How big is the performance regression?

BTW, all of memcg's limit interfaces currently use bytes as the unit of
accounting. Mixing bytes and object counts for accounting is a small
concern to me.

But I think increasing the number of subsystems is not very good...

Regards,
-Kame
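
For readers who have not seen the patch set, the overhead in question comes from the hierarchical charge described in the quoted summary. A rough userspace sketch of that idea might look like the following. This is an illustration only, not the code from the posted series: the struct and the function names (task_counter, task_counter_charge, task_counter_uncharge) are hypothetical, and locking is left out entirely.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical per-cgroup counter; not the structure used in the patch set. */
struct task_counter {
    unsigned long long usage;     /* tasks currently charged to this group */
    unsigned long long limit;     /* ULLONG_MAX means "no limit" (the default) */
    struct task_counter *parent;  /* NULL at the root of the hierarchy */
};

/*
 * Charge one task to c and to every ancestor of c (what would happen at fork
 * or when a task migrates into the group). Returns 0 on success, or -1 if
 * any level is already at its limit, in which case nothing stays charged.
 */
static int task_counter_charge(struct task_counter *c)
{
    struct task_counter *it, *undo;

    for (it = c; it != NULL; it = it->parent) {
        if (it->usage >= it->limit)
            goto rollback;
        it->usage++;
    }
    return 0;

rollback:
    /* Undo the levels charged before the one that refused. */
    for (undo = c; undo != it; undo = undo->parent)
        undo->usage--;
    return -1;
}

/* Mirror of the charge: walk the same path on task exit. */
static void task_counter_uncharge(struct task_counter *c)
{
    for (; c != NULL; c = c->parent) {
        assert(c->usage > 0);
        c->usage--;
    }
}
```

Note that both walks visit every ancestor even when every limit is ULLONG_MAX, which is the "off-case" cost mentioned above: the accounting is paid whether or not anyone has set a limit.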