On 07/21/2011 08:34 AM, Daniel P. Berrange wrote:
> On Thu, Jul 21, 2011 at 07:54:05AM -0500, Adam Litke wrote:
>> Added Anthony to give him the opportunity to address the finer points of
>> this one, especially with respect to the qemu IO thread(s).
>>
>> This feature is really about capping the compute performance of a VM
>> such that we get consistent top-end performance. Yes, qemu has non-VCPU
>> threads that this patch set doesn't govern, but that's the point. We
>> are not attempting to throttle IO or device emulation with this feature.
>> It's true that an IO-intensive guest may consume more host resources
>> than a compute-intensive guest, but they should still have equal top-end
>> CPU performance when viewed from the guest's perspective.
>
> I could be misunderstanding what you're trying to achieve here,
> so perhaps we should consider an example.

From your example, it's clear to me that you understand the use case well.

> - A machine has 4 physical CPUs
> - There are 4 guests on the machine
> - Each guest has 2 virtual CPUs
>
> So we've overcommitted the host CPU resources 2x here.
>
> Let's say that we want to use this feature to ensure consistent
> top-end performance of every guest, splitting the host pCPU
> resources evenly across all guests, so each guest is ensured
> 1 pCPU worth of CPU time overall.
>
> This patch lets you do this by assigning caps per VCPU. So
> in this example, each VCPU cgroup would have to be configured
> to cap the VCPUs at 50% of a single pCPU.
>
> This leaves the other QEMU threads uncapped / unaccounted
> for. If any one guest causes non-trivial compute load in
> a non-VCPU thread, this can/will impact the top-end compute
> performance of all the other guests on the machine.
>
> If we did caps per VM, then you could set the VM cgroup
> such that the VM as a whole had 100% of a single pCPU.
>
> If a guest is 100% compute bound, it can use its full
> 100% of a pCPU allocation in vCPU threads. If any other
> guest is causing CPU time in a non-VCPU thread, it cannot
> impact the top-end compute performance of VCPU threads in
> the other guests.
>
> A per-VM cap would, however, mean a guest with 2 vCPUs
> could have unequal scheduling, where one vCPU claimed 75%
> of the pCPU and the other vCPU got left with only 25%.
>
> So AFAICT, per-VM cgroups are better for ensuring the top-end
> compute performance of a guest as a whole, but per-VCPU
> cgroups can ensure consistent top-end performance across
> vCPUs within a guest.
>
> IMHO, per-VM cgroups are the more useful because they are the
> only way to stop guests impacting each other, but there
> could be additional benefits to *also* having per-VCPU cgroups
> if you want to ensure fairness of top-end performance across
> vCPUs inside a single VM.

What this says to me is that per-VM cgroups _in_addition_to_ per-vCPU cgroups is the _most_ useful arrangement. Since I can't think of any case where someone would want per-VM but not per-vCPU cgroups, how about we always do both when supported? We can still use one pair of tunables (<period> and <quota>) and try to do the right thing.
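To make that mapping concrete, here is a minimal sketch (Python, purely illustrative; the function name and data layout are mine, not libvirt code) of how a single <period>/<quota> pair could be expanded into per-VM and per-vCPU cgroup values:

  # Expand one <period>/<quota> pair into per-VM and per-vCPU cgroup
  # settings, following the proposal above (names are hypothetical).
  def plan_cgroup_limits(period_us, quota_us, nvcpus, has_vcpu_threads):
      plan = {
          # The VM as a whole may consume nvcpus * quota per period ...
          "vm": {"cfs_period_us": period_us,
                 "cfs_quota_us": quota_us * nvcpus},
          "vcpus": [],
      }
      if has_vcpu_threads:
          # ... while each vCPU thread is individually capped at the
          # per-vCPU quota, keeping scheduling fair inside the guest.
          plan["vcpus"] = [
              {"cfs_period_us": period_us, "cfs_quota_us": quota_us}
              for _ in range(nvcpus)
          ]
      return plan

  # <vcpus>2</vcpus>, <period>500000</period>, <quota>250000</quota>
  print(plan_cgroup_limits(500000, 250000, nvcpus=2, has_vcpu_threads=True))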
For example:

  <vcpus>2</vcpus>
  <cputune>
    <period>500000</period>
    <quota>250000</quota>
  </cputune>

would have the following behavior for qemu-kvm (vcpu threads):

  Global VM cgroup:  cfs_period: 500000  cfs_quota: 500000
  Each vcpu cgroup:  cfs_period: 500000  cfs_quota: 250000

and this behavior for qemu with no vcpu threads:

  Global VM cgroup:  cfs_period: 500000  cfs_quota: 500000

It's true that IO could still throw off the scheduling balance somewhat among vCPUs _within_ a VM, but this effect would be confined to the VM itself. Best of both worlds?

--
Adam Litke
IBM Linux Technology Center