Re: [RFC PATCH 0/2] support cgroup pool in v1

Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> · Wed, 8 Sep 2021 14:37:23 +0200

On Wed, Sep 08, 2021 at 08:15:11PM +0800, Yi Tao wrote:
> In a scenario where containers are started with high concurrency, in
> order to control the use of system resources by the container, it is
> necessary to create a corresponding cgroup for each container and
> attach the process. The kernel uses the cgroup_mutex global lock to
> protect the consistency of the data, which results in a higher
> long-tail delay for cgroup-related operations during concurrent startup.
> For example, the long-tail delay of creating a cgroup under each
> subsystem is 900ms when starting 400 containers, which becomes a
> performance bottleneck. The delay is mainly composed of two parts, namely the
> time of the critical section protected by cgroup_mutex and the
> scheduling time of sleep. The scheduling time increases as CPU
> overhead increases.

Perhaps you shouldn't be creating that many containers all at once?
What normal workload requires this?

thanks,

greg k-h