On 05/02/2016 09:44 PM, David Matlack wrote:
> On Mon, May 2, 2016 at 3:42 AM, Christian Borntraeger
> <borntraeger@xxxxxxxxxx> wrote:
>> Radim, Paolo,
>>
>> can you have a look at this patch? If you are ok with it, I want to
>> submit this patch with my next s390 pull request. It touches KVM common
>> code, but I tried to make it a nop for everything but s390.
>>
>> Christian
>>
>> ----snip----
>>
>>
>> Some wakeups should not be considered a successful poll. For example on
>> s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
>> would be considered runnable - letting all vCPUs poll all the time for
>> transactional like workload, even if one vCPU would be enough.
>> This can result in huge CPU usage for large guests.
>> This patch lets architectures provide a way to qualify wakeups if they
>> should be considered good or bad wakeups in regard to polls.
>>
>> For s390 the implementation will fence off halt polling for anything but
>> known good, single vCPU events. The s390 implementation for floating
>> interrupts does a wakeup for one vCPU, but the interrupt will be delivered
>> by whatever CPU comes first.
>
> Can the delivery of the floating interrupt to the "first CPU" be done
> by kvm_vcpu_check_block? If so, then kvm_vcpu_check_block can return
> false for all other CPUs and the polling problem goes away.
>

The delivery of interrupts is always done inside the __vcpu_run function.
So when we leave kvm_vcpu_block we will come back to __vcpu_run and
deliver pending interrupts (if not masked by PSW or control registers)
according to their priority.
I remember that some time ago we had a reason why we could not deliver
in kvm_vcpu_block but I forgot why :-/

>> To limit the halt polling we only mark the
>> woken up CPU as a valid poll. This code will also cover several other
>> wakeup reasons like IPI or expired timers. This will of course also mark
>> some events as not successful. As KVM on z always runs as a 2nd level
>> hypervisor, we prefer to not poll, unless we are really sure, though.
>>
>> So we start with a minimal set and will provide additional patches in
>> the future that mark additional code paths as valid wakeups, if that
>> turns out to be necessary.
>>
>> This patch successfully limits the CPU usage for cases like uperf 1byte
>> transactional ping pong workload or wakeup heavy workload like OLTP
>> while still providing a proper speedup.
>>
>> Signed-off-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
>
> Reviewed-By: David Matlack <dmatlack@xxxxxxxxxx>
> (I reviewed the non-s390 case, to make sure that this change is a nop.)
>
> Request to be cc'd on halt-polling patches in the future. Thanks!

Sure.
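
For readers without the patch at hand, the wakeup qualification described in the quoted commit message could look roughly like the sketch below. This is not the code from the patch: struct sketch_vcpu, the two hook functions and sketch_account_poll() are hypothetical names made up for illustration, and the model ignores everything except the decision of whether a finished poll counts as successful. It assumes an architecture hook that defaults to "every wakeup is valid" (the nop case for non-s390) and a stricter variant that accepts only the vCPU that was explicitly woken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a vCPU; not the kernel's struct. */
struct sketch_vcpu {
	bool runnable;               /* something is pending, e.g. a floating interrupt */
	bool valid_wakeup;           /* set only if this vCPU was explicitly woken */
	unsigned long polls_ok;      /* polls counted as successful */
	unsigned long polls_invalid; /* polls ended by a wakeup that does not count */
};

typedef bool (*valid_wakeup_fn)(const struct sketch_vcpu *v);

/* Assumed generic default: every wakeup is valid, so behaviour is unchanged. */
static bool sketch_default_valid_wakeup(const struct sketch_vcpu *v)
{
	(void)v;
	return true;
}

/* Assumed s390-like variant: only the explicitly woken vCPU qualifies. */
static bool sketch_strict_valid_wakeup(const struct sketch_vcpu *v)
{
	return v->valid_wakeup;
}

/*
 * Account for one finished poll. Returns true if the poll counts as
 * successful, false if it timed out or ended with an unqualified wakeup.
 */
static bool sketch_account_poll(struct sketch_vcpu *v, valid_wakeup_fn valid)
{
	assert(v != NULL && valid != NULL);

	if (!v->runnable)
		return false;        /* poll ran out without any wakeup */

	if (valid(v)) {
		v->polls_ok++;
		return true;
	}

	v->polls_invalid++;          /* runnable, but not a qualifying wakeup */
	return false;
}
```

With the strict hook, a floating interrupt that makes every vCPU look runnable only counts as a successful poll for the one vCPU that was actually woken, which is the effect the commit message describes. With the default hook the accounting is the same as before.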