At 07/21/2011 11:25 PM, Adam Litke wrote:
>
> On 07/21/2011 09:29 AM, Daniel P. Berrange wrote:
>> On Thu, Jul 21, 2011 at 08:49:28AM -0500, Adam Litke wrote:
>>> On 07/21/2011 08:34 AM, Daniel P. Berrange wrote:
>>>> On Thu, Jul 21, 2011 at 07:54:05AM -0500, Adam Litke wrote:
>>>>> Added Anthony to give him the opportunity to address the finer points
>>>>> of this one, especially with respect to the qemu IO thread(s).
>>>>>
>>>>> This feature is really about capping the compute performance of a VM
>>>>> such that we get consistent top-end performance. Yes, qemu has
>>>>> non-VCPU threads that this patch set doesn't govern, but that's the
>>>>> point. We are not attempting to throttle IO or device emulation with
>>>>> this feature. It's true that an IO-intensive guest may consume more
>>>>> host resources than a compute-intensive guest, but they should still
>>>>> have equal top-end CPU performance when viewed from the guest's
>>>>> perspective.
>>>>
>>>> I could be misunderstanding what you're trying to achieve here,
>>>> so perhaps we should consider an example.
>>>
>>> From your example, it's clear to me that you understand the use case
>>> well.
>>>
>>>>  - A machine has 4 physical CPUs
>>>>  - There are 4 guests on the machine
>>>>  - Each guest has 2 virtual CPUs
>>>>
>>>> So we've overcommitted the host CPU resources x2 here.
>>>>
>>>> Let's say that we want to use this feature to ensure consistent
>>>> top-end performance of every guest, splitting the host pCPU
>>>> resources evenly across all guests, so each guest is ensured
>>>> 1 pCPU worth of CPU time overall.
>>>>
>>>> This patch lets you do this by assigning caps per VCPU. So
>>>> in this example, each VCPU cgroup would have to be configured
>>>> to cap the VCPUs at 50% of a single pCPU.
>>>>
>>>> This leaves the other QEMU threads uncapped / unaccounted
>>>> for. If any one guest causes non-trivial compute load in
>>>> a non-VCPU thread, this can/will impact the top-end compute
>>>> performance of all the other guests on the machine.
>>>>
>>>> If we did caps per VM, then you could set the VM cgroup
>>>> such that the VM as a whole had 100% of a single pCPU.
>>>>
>>>> If a guest is 100% compute bound, it can use its full
>>>> 100% of a pCPU allocation in VCPU threads. If any other
>>>> guest is causing CPU time in a non-VCPU thread, it cannot
>>>> impact the top-end compute performance of VCPU threads in
>>>> the other guests.
>>>>
>>>> A per-VM cap would, however, mean a guest with 2 vCPUs
>>>> could have unequal scheduling, where one vCPU claimed 75%
>>>> of the pCPU and the other vCPU got left with only 25%.
>>>>
>>>> So AFAICT, per-VM cgroups are better for ensuring the top-end
>>>> compute performance of a guest as a whole, but per-VCPU
>>>> cgroups can ensure consistent top-end performance across
>>>> vCPUs within a guest.
>>>>
>>>> IMHO, per-VM cgroups are the more useful, because they are the
>>>> only way to stop guests impacting each other, but there
>>>> could be additional benefits of *also* having per-VCPU cgroups
>>>> if you want to ensure fairness of top-end performance across
>>>> vCPUs inside a single VM.
>>>
>>> What this says to me is that per-VM cgroups _in_addition_to_ per-vcpu
>>> cgroups is the _most_ useful situation. Since I can't think of any
>>> cases where someone would want per-VM and not per-vcpu, how about we
>>> always do both when supported. We can still use one pair of tunables
>>> (<period> and <quota>) and try to do the right thing.
>>> For example:
>>>
>>>   <vcpus>2</vcpus>
>>>   <cputune>
>>>     <period>500000</period>
>>>     <quota>250000</quota>
>>>   </cputune>
>>>
>>> would have the following behavior for qemu-kvm (vcpu threads):
>>>
>>>   Global VM cgroup:  cfs_period: 500000  cfs_quota: 500000
>>>   Each vcpu cgroup:  cfs_period: 500000  cfs_quota: 250000
>>>
>>> and this behavior for qemu with no vcpu threads:
>>
>> So, whatever quota value is in the XML, you would multiply that
>> by the number of vCPUs and use it to set the VM quota value?
>
> Yep.
>
>> I'm trying to think if there is ever a case where you don't want
>> the VM to be a plain multiple of the VCPU value, but I can't
>> think of one.
>>
>> So the only real discussion point here is whether the quota
>> value in the XML is treated as a per-VM value or a per-VCPU
>> value.
>
> I think it has to be per-VCPU. Otherwise the user will have to remember
> to do the multiplication themselves. If they forget to do this they
> will get a nasty performance surprise.
>
>> cpu_shares is treated as a per-VM value, period doesn't matter,
>> but cpu_quota would be a per-VCPU value, multiplied to get a
>> per-VM value when needed. I still find this mismatch rather
>> weird, to be fair.
>
> Yes, this is unfortunate. But cpu_shares is a comparative value whereas
> quota is quantitative. In the future we could apply 'shares' at the
> vcpu level too. In that case we'd just pick some arbitrary number and
> apply it to each vcpu cgroup.
>
>> So the current behaviour
>>
>>   if vCPU threads
>>       set quota in vCPU group
>>   else
>>       set nVCPUs * quota in VM group
>>
>> would change to
>>
>>   set nVCPUs * quota in VM group
>>   if vCPU threads
>>       set quota in vCPU group
>>
>> ?
>
> Yes.

I take this answer to mean that you agree with Daniel P. Berrange's idea.
If so, I will implement it.

>
>> We need to remember to update the VM cgroup if we change the
>> number of vCPUs on a running guest, of course.
>
> When can this happen? Does libvirt support CPU hotplug?
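
To make sure we are agreeing on the same behaviour, here is a rough sketch
of the logic I plan to implement. The cgroup paths, directory layout and
helper names below are only illustrative (they are not libvirt's real
internal API); the values are simply written into the kernel's
cpu.cfs_period_us / cpu.cfs_quota_us control files:

#include <stdio.h>
#include <stdbool.h>

/* Write a single value into a cgroup control file such as cpu.cfs_quota_us. */
static int write_cgroup_value(const char *path, long long value)
{
    FILE *fp = fopen(path, "w");
    if (!fp)
        return -1;
    fprintf(fp, "%lld\n", value);
    return fclose(fp) == 0 ? 0 : -1;
}

/*
 * Apply the <period>/<quota> values from the XML.  'quota' is per-VCPU;
 * the VM cgroup always gets nvcpus * quota so the cap also works for a
 * qemu binary that has no per-vcpu threads.
 */
static int apply_cpu_quota(const char *vm_cgroup,   /* illustrative, e.g. ".../cpu/libvirt/qemu/guest" */
                           int nvcpus,
                           long long period,        /* usec */
                           long long quota,         /* usec, per VCPU */
                           bool has_vcpu_threads)
{
    char path[256];
    int i;

    /* Always cap the VM as a whole at nvcpus * quota. */
    snprintf(path, sizeof(path), "%s/cpu.cfs_period_us", vm_cgroup);
    if (write_cgroup_value(path, period) < 0)
        return -1;
    snprintf(path, sizeof(path), "%s/cpu.cfs_quota_us", vm_cgroup);
    if (write_cgroup_value(path, quota * nvcpus) < 0)
        return -1;

    /* If qemu exposes vcpu threads, also cap each vcpu cgroup. */
    if (has_vcpu_threads) {
        for (i = 0; i < nvcpus; i++) {
            snprintf(path, sizeof(path), "%s/vcpu%d/cpu.cfs_period_us", vm_cgroup, i);
            if (write_cgroup_value(path, period) < 0)
                return -1;
            snprintf(path, sizeof(path), "%s/vcpu%d/cpu.cfs_quota_us", vm_cgroup, i);
            if (write_cgroup_value(path, quota) < 0)
                return -1;
        }
    }

    return 0;
}

With the example above (2 vcpus, period 500000, quota 250000) this gives the
VM cgroup a cfs_quota of 500000 and each vcpu cgroup a cfs_quota of 250000,
matching the behaviour Adam described.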