Re: [PATCH] Temporarily shutoff CPUSET cgroup in libvirt.

Eric Blake <eblake@xxxxxxxxxx> · Fri, 21 Sep 2012 13:04:42 -0600

On 09/21/2012 02:20 AM, Tang Chen wrote:
> When a cpu or memory is offlined and onlined again, cgroup will not
> recover any corresponding values. This is a problem in kernel.
> Do not use CPUSET cgroup to limit threads using cpu and memory until
> the problem is fixed in kernel.
> 
> Signed-off-by: Tang Chen <tangchen@xxxxxxxxxxxxxx>
> ---
>  src/qemu/qemu.conf   |    5 ++++-
>  src/qemu/qemu_conf.c |    7 ++++++-
>  src/util/cgroup.c    |    3 +--
>  3 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
> index 18105ca..6bf7290 100644
> --- a/src/qemu/qemu.conf
> +++ b/src/qemu/qemu.conf
> @@ -211,7 +211,10 @@
>  # can be mounted in different locations. libvirt will detect
>  # where they are located.
>  #
> -#cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuset", "cpuacct" ]
> +# When a cpu or memory is offlined and onlined again, cgroup will not
> +# recover any corresponding values. This is a problem in kernel.
> +# So DO NOT use cpuset cgroup until this problem is fixed in kernel.
> +#cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuacct" ]

We ought to list which versions of the kernel are known to have the bug
(that is, say something like 'at least through kernel 3.6'); when we
finally know which kernel version fixes the bugs in the cpuset
controller, then we can update the text again to give the user better
advice on whether they want to be using this.

I'm torn on whether to take this patch prior to 0.10.2 - it makes sense
to take (that is, the kernel is buggy, and we cause degraded performance
as a result of the kernel bug any time the system goes through S3), but
it's rather late, having missed rc2, for such a major change in default
functionality.  Am I correct, though, that you can manually edit
/etc/libvirt/qemu.conf to re-enable the use of cpuset even on a buggy
kernel, for further testing the issues?

>  
>  # This is the basic set of devices allowed / required by
>  # all virtual machines.
> diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
> index 91a56f1..80b0787 100644
> --- a/src/qemu/qemu_conf.c
> +++ b/src/qemu/qemu_conf.c
> @@ -397,12 +397,17 @@ int qemudLoadDriverConfig(struct qemud_driver *driver,
>              driver->cgroupControllers |= (1 << ctl);
>          }
>      } else {
> +        /*
> +         * When a cpu or memory is offlined and onlined again, cgroup will not
> +         * recover any corresponding values. This is a problem in kernel.
> +         * Do not use CPUSET cgroup to limit threads using cpu and memory until
> +         * the problem is fixed in kernel.
> +         */
>          driver->cgroupControllers =
>              (1 << VIR_CGROUP_CONTROLLER_CPU) |
>              (1 << VIR_CGROUP_CONTROLLER_DEVICES) |
>              (1 << VIR_CGROUP_CONTROLLER_MEMORY) |
>              (1 << VIR_CGROUP_CONTROLLER_BLKIO) |
> -            (1 << VIR_CGROUP_CONTROLLER_CPUSET) |
>              (1 << VIR_CGROUP_CONTROLLER_CPUACCT);

Rather than hard-coding the non-use of cpuset as our default, can we do
something like a uname() probe or some other clue of whether we have a
buggy kernel?

> +++ b/src/util/cgroup.c
> @@ -543,8 +543,7 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group,
>          /* We need to control cpu bandwidth for each vcpu now */
>          if ((flags & VIR_CGROUP_VCPU) &&
>              (i != VIR_CGROUP_CONTROLLER_CPU &&
> -             i != VIR_CGROUP_CONTROLLER_CPUACCT &&
> -             i != VIR_CGROUP_CONTROLLER_CPUSET)) {
> +             i != VIR_CGROUP_CONTROLLER_CPUACCT)) {
>              /* treat it as unmounted and we can use virCgroupAddTask */
>              VIR_FREE(group->controllers[i].mountPoint);
>              continue;

Is this hunk necessary?  In other words, don't we just gracefully skip
making per-vcpu groups for any controller that was not mounted, or do we
loudly fail if all of these controllers are not present?

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment:
signature.asc

Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list