> I am by far no expert. vCPUs are "just" (e.g. QEMU) threads and treated
> that way by the Linux scheduler. So, whatever scheduler policy you
> configure for these threads will be used in the kernel.

Thank you for answering, David; this confirms my current understanding.
KVM/QEMU relies on the host's CPU scheduler rather than re-inventing the
wheel: whatever the base OS uses, KVM/QEMU uses as well.

I'm interested in whether the underlying CPU scheduler (in this case
CFS, but it could be any scheduler, as you've stated) attempts
co-scheduling or a similar mechanism for virtual machines. It would make
sense for such a mechanism to exist, as guest operating systems tend to
assume simultaneous access to their cores, but so far I haven't found
anything to support this.

> As you're pushing 8 virtual CPUs onto 3 logical CPUs, there will never
> be such a thing as "synchronous progress". This is even true when
> having 2 virtual CPUs on 3 logical CPUs, at least not in default
> scenarios. There will be other processes/threads to schedule.

I should have phrased this better. In this example, the host has access
to at least 24 cores but only has 3 cores available to the guest VM at
this instant in time (the others have been allocated to other VMs). My
apologies for not adequately conveying my intent.

> I hope somebody else with more insight can answer that. But in general,
> to avoid packet drops you might want to look into (v)CPU pinning / KVM
> RT. But this will require having *at least* as many logical cores as
> you have virtual CPUs.

You're exactly right, and we do this in our latency-sensitive
environments for a performance boost (in addition to CPU isolation,
PCI/SR-IOV passthrough, disabling C/P states, and a few other tweaks).
However, this approach isn't possible with oversubscription, which is
the primary draw of virtualizing the environment. (For reference, a
rough sketch of inspecting and pinning the vCPU threads is appended
below the quoted message.)

On Tue, May 22, 2018 at 9:38 AM, David Hildenbrand <david@xxxxxxxxxx> wrote:
> On 18.05.2018 19:22, William Scott wrote:
>> Greetings!
>>
>> I'm encountering difficulty understanding the Linux CPU scheduler
>> within the context of KVM virtual machines. Specifically, I'd like to
>> understand if/how groups of logical cores are allocated to virtual
>> machines in an oversubscribed environment.
>
> Hi,
>
> I am by far no expert. vCPUs are "just" (e.g. QEMU) threads and treated
> that way by the Linux scheduler. So, whatever scheduler policy you
> configure for these threads will be used in the kernel.
>
>>
>> At a high level, my question is "how does the scheduler handle
>> allocation of logical cores to a VM that is provisioned more cores
>> than are currently available? E.g., the host has 3 logical cores
>> available but a VM is provisioned with 8 vCPUs." I'm predominantly
>> concerned with the guest operating system not observing synchronous
>> progress across all vCPUs and potential related errors, e.g. a watchdog
>> timer might expect a response from a sibling vCPU (which was not
>
> As you're pushing 8 virtual CPUs onto 3 logical CPUs, there will never
> be such a thing as "synchronous progress". This is even true when
> having 2 virtual CPUs on 3 logical CPUs, at least not in default
> scenarios. There will be other processes/threads to schedule.
>
> Now, let's assume watchdogs are usually ~30 seconds. So if your
> hypervisor is heavily overloaded, it can of course happen that a
> watchdog strikes, or rather some RCU deadlock prevention in your guest
> operating system will trigger before that.
>
> We apply some techniques to optimize some scenarios, e.g.
> VCPU yielding, paravirtualized spinlocks etc., to avoid a guest VCPU
> wasting CPU cycles waiting for conditions that require other VCPUs to
> run first.
>
>> allocated a logical core to run on) within a specified time. I expect
>> KVM to use the completely fair scheduler (CFS) and a variation of
>> co-scheduling/gang scheduling, but I've been unable to discern whether
>> this is true (it was mentioned in a lwn.net article in 2011, but
>> hasn't been expanded upon since: https://lwn.net/Articles/472797/).
>>
>> I've discovered ESXi approaches this with relaxed co-scheduling:
>> https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/vmware-vsphere-cpu-sched-performance-white-paper.pdf
>> (pg. 7).
>>
>> I've also discovered a similar email discussion directed at a
>> different mailing list (which suggested mailing this one):
>> https://lists.centos.org/pipermail/centos-virt/2010-November/002214.html
>>
>> For context on my end, I am operating two virtual machine 'stacks' in
>> a heavily oversubscribed OpenStack KVM cloud environment. Each
>> 'stack' consists of two virtual machines. The first generates network
>> traffic (a 'traffic generator') and sends this traffic through two
>> separate interfaces to corresponding networks. The second virtual
>> machine acts as a bridge for these networks. A rudimentary diagram is
>> shown below.
>>
>> .----[traffic generator]----.
>> |                           |
>> '--------[VM bridge]--------'
>>
>> Interestingly:
>> * When the VM bridge is provisioned with 2 vCPUs, the traffic
>>   generator reports ~10% packet loss
>> * When the VM bridge is provisioned with 4 vCPUs, the traffic
>>   generator reports ~40% packet loss
>>
>> I suspect the packet loss originates from virtual interface buffer
>> overflows. To the best of my understanding, although the completely
>> fair scheduler would schedule the VMs for equivalent durations, the
>> 2-vCPU VM will be scheduled more frequently (for shorter periods)
>> because it is easier for the scheduler to find and allocate 2 vCPUs
>> than 4 vCPUs. This allows the buffers to be emptied more regularly,
>> which results in less packet loss. However, in order to
>> prove/disprove this theory, I'd need to know how the completely fair
>> scheduler handles co-scheduling in the context of KVM virtual
>> machines.
>
> I hope somebody else with more insight can answer that. But in general,
> to avoid packet drops you might want to look into (v)CPU pinning / KVM
> RT. But this will require having *at least* as many logical cores as
> you have virtual CPUs.
>
>>
>> Thank you kindly,
>> William
>>
>
>
> --
>
> Thanks,
>
> David / dhildenb
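
P.S. In case it is useful to anyone following the thread, below is a
minimal, untested sketch (Python) of how one might inspect how the host
kernel schedules a guest's QEMU vCPU threads, and pin them 1:1 when
enough logical cores are available. It assumes a Linux /proc layout and
that QEMU names its vCPU threads "CPU <n>/KVM"; that naming can differ
between QEMU versions, so treat the matching as illustrative rather
than definitive. Changing another process's affinity generally requires
root (or CAP_SYS_NICE).

#!/usr/bin/env python3
# Sketch: inspect the scheduler policy/affinity of a QEMU guest's vCPU
# threads, and optionally pin them 1:1 to host logical CPUs.
# Assumption: vCPU threads are named like "CPU 0/KVM" (QEMU-version
# dependent); adjust the match below if yours differ.
import os
import sys

POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER (CFS)",
    os.SCHED_FIFO:  "SCHED_FIFO",
    os.SCHED_RR:    "SCHED_RR",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE:  "SCHED_IDLE",
}

def vcpu_threads(qemu_pid):
    """Yield (tid, thread_name) for threads that look like vCPU threads."""
    task_dir = "/proc/%d/task" % qemu_pid
    for tid in sorted(int(t) for t in os.listdir(task_dir)):
        with open("%s/%d/comm" % (task_dir, tid)) as f:
            name = f.read().strip()
        if "KVM" in name:  # e.g. "CPU 0/KVM" -- see assumption above
            yield tid, name

def show(qemu_pid):
    """Print the policy and host-CPU affinity applied to each vCPU thread."""
    for tid, name in vcpu_threads(qemu_pid):
        policy = os.sched_getscheduler(tid)
        affinity = sorted(os.sched_getaffinity(tid))
        print("%-12s tid=%-7d policy=%-18s affinity=%s"
              % (name, tid, POLICY_NAMES.get(policy, str(policy)), affinity))

def pin(qemu_pid, host_cpus):
    """Pin vCPU thread N to host_cpus[N]. Only sensible when the host has
    at least as many free logical CPUs as the guest has vCPUs -- exactly
    what oversubscription rules out."""
    for (tid, name), cpu in zip(vcpu_threads(qemu_pid), host_cpus):
        os.sched_setaffinity(tid, {cpu})
        print("pinned %s (tid %d) -> host CPU %d" % (name, tid, cpu))

if __name__ == "__main__":
    qemu_pid = int(sys.argv[1])   # PID of the qemu-system-* process
    show(qemu_pid)
    # Example (needs privileges): pin(qemu_pid, [2, 3, 4, 5])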