Hello, Waiman.

On Wed, Apr 12, 2023 at 11:37:53AM -0400, Waiman Long wrote:
> This patch series introduces a new "isolcpus" partition type to the
> existing list of {member, root, isolated} types. The primary reason
> of adding this new "isolcpus" partition is to facilitate the
> distribution of isolated CPUs down the cgroup v2 hierarchy.
>
> The other non-member partition types have the limitation that their
> parents have to be valid partitions too. It will be hard to create a
> partition a few layers down the hierarchy.
>
> It is relatively rare to have applications that require creation of
> a separate scheduling domain (root). However, it is more common to
> have applications that require the use of isolated CPUs (isolated),
> e.g. DPDK. One can use the "isolcpus" or "nohz_full" boot command options
> to get that statically. Of course, the "isolated" partition is another
> way to achieve that dynamically.
>
> Modern container orchestration tools like Kubernetes use the cgroup
> hierarchy to manage different containers. If a container needs to use
> isolated CPUs, it is hard to get those with existing set of cpuset
> partition types. With this patch series, a new "isolcpus" partition
> can be created to hold a set of isolated CPUs that can be pull into
> other "isolated" partitions.
>
> The "isolcpus" partition is special that there can have at most one
> instance of this in a system. It serves as a pool for isolated CPUs
> and cannot hold tasks or sub-cpusets underneath it. It is also not
> cpu-exclusive so that the isolated CPUs can be distributed down the
> sibling hierarchies, though those isolated CPUs will not be useable
> until the partition type becomes "isolated".
>
> Once isolated CPUs are needed in a cgroup, the administrator can write
> a list of isolated CPUs into its "cpuset.cpus" and change its partition
> type to "isolated" to pull in those isolated CPUs from the "isolcpus"
> partition and use them in that cgroup. That will make the distribution
> of isolated CPUs to cgroups that need them much easier.

I'm not sure about this. It feels really hacky in that it side-steps the
distribution hierarchy completely. I can imagine a non-isolated cpuset
wanting to allow isolated cpusets downstream but that should be done
hierarchically - e.g. by allowing a cgroup to express what isolated cpus
are allowed in the subtree. Also, can you give more details on the
targeted use cases?

Thanks.

--
tejun